UPDATE: I have another article on this topic called PASTE primer, which covers the simplest way to join any number of lines without AWK or SED and is much easier.


…Joining Lines…
Imagine a file named test.txt with the following contents; this will be our test subject.

Reading test.txt with cat:

cat test.txt

produces this output:

1
2
3
4
5
6
7
8
9
10
11
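
If you want to follow along, a file like this can be created in one line. This is just a sketch: the file name test.txt comes from the example above, and printf is used only because it is available in any POSIX shell (seq 1 11 should also work where seq is installed).

```shell
# Write the numbers 1 through 11 to test.txt, one per line
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 > test.txt

# Count the lines to confirm that all 11 were written
wc -l < test.txt
```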

How do we join every other line with the one after it? (That is, join the lines 2 at a time.)

3 ways (Citing: http://stackoverflow.com/questions/8987257/concatenating-every-other-line-with-the-next)

With SED:

cat test.txt | sed '$!N;s/\n/ /'

Results of the above cat to sed pipe

1 2
3 4
5 6
7 8
9 10
11
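
For anyone wondering what that sed expression does, here is the same command with comments. It is shown without the cat, since sed can read the file itself, and the first line only recreates the sample file so the snippet runs on its own.

```shell
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 > test.txt

# $!N      : on every line except the last ($!), append the next line (N)
#            to the pattern space, separated by a newline
# s/\n/ /  : replace that embedded newline with a space
sed '$!N;s/\n/ /' test.txt
```

Because the N is skipped on the last line, an odd line at the end of the file is printed by itself.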

With PASTE:

# running any one of the following variations of paste and cat returns the same result
paste -s -d ' \n' test.txt
cat test.txt | paste -s -d ' \n'
cat test.txt | paste -d" " - -

Result with paste and cat:

1 2
3 4
5 6
7 8
9 10
11
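
The last paste variation generalizes nicely: each - tells paste to read one line from standard input, so three dashes join three lines, and so on. Below is a rough sketch of a helper that builds the dashes for you. The function name join_n is made up for this example and is not part of paste. Keep in mind that when the file runs out in the middle of a group, paste still prints the delimiters for the missing columns, so the last line may end with trailing spaces.

```shell
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 > test.txt

# join_n N FILE (hypothetical helper): join every N lines of FILE with spaces
join_n() {
    n=$1
    file=$2
    set --                      # reuse the positional parameters as a list
    i=0
    while [ "$i" -lt "$n" ]; do
        set -- "$@" -           # one "-" per line to be joined
        i=$((i + 1))
    done
    paste -d ' ' "$@" < "$file"
}

join_n 3 test.txt
```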

 With AWK:

# Using any of the following 4 variations of awk and cat together will return the same results
# (b="" clears the old value first: at the end of the file getline fails and would otherwise leave the previous 10 in b)
awk '{b="";getline b;printf("%s %s\n",$0,b)}' test.txt
cat test.txt | awk '{b="";getline b;printf("%s %s\n",$0,b)}'
awk '{b="";getline b;printf "%s %s\n",$0,b}' test.txt
cat test.txt | awk '{b="";getline b;printf "%s %s\n",$0,b }'

Here are the results:

1 2
3 4
5 6
7 8
9 10
11

NOTE: there are 2 choices in how to execute this
1. cat the file and pipe it to awk, or have awk read the file directly
2. call printf with () or without parentheses; either way, all of them achieve the same result

What happens above with the awk script is pretty simple.
Awk executes the command or commands in {} for every line.
The 2 commands doing the work are
getline b;
printf("%s %s\n",$0,b);
First, let's look at the printf line.
It takes 2 string values (one for each %s) and prints them side by side, separated by a space and followed by a newline character (\n).
The first string is $0, which means the current line. The second string, on the same output line and just a space away, is the b variable. But what's the b variable?
Looking at the previous command, "getline b", we see where b is set. Getline reads the next line of input and stores it in the given variable. The best part is that on the next run of awk, the current line will not be the one that was just read by getline, but the one after it.
So if we break it down:

1 # <- first time awk runs this is $0
2 # <- still on the first time awk is running, getline b, stores this value in b
# the above two turn to: 1 2
3 # <- 2nd time awk runs this is in $0 (because getline command from before moved the current-line one line forward)
4 # <- still on the 2nd time awk is running, getline b, stores this value in b
# the above two turn to: 3 4
5 # <- 3rd time awk runs this is in $0 (because getline command from before moved the current-line one line forward)
6 # <- still on the 3rd time awk is running, getline b, stores this value in b
# the above two turn to: 5 6
7 # <- 4th time awk runs this is in $0 (because getline command from before moved the current-line one line forward)
8 # <- still on the 4th time awk is running, getline b, stores this value in b
# the above two turn to: 7 8
9 # <- 5th time awk runs this is in $0 (because getline command from before moved the current-line one line forward)
10 # <- still on the 5th time awk is running, getline b, stores this value in b
# the above two turn to: 9 10
11 # <- 6th time awk runs this is in $0 (because getline command from before moved the current-line one line forward)
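# the above is the last line: there is no 12 to pair it with
# NOTE: on the 6th time awk runs, getline finds no more input and does not store anything in b. It leaves b as it was, so b still holds 10 from the 5th run unless it is cleared first (for example with b="" before the getline).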
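
One way to make the end-of-file case explicit is to check what getline returns: it gives 1 when it read a line and 0 when there was nothing left to read. The sketch below uses that to print the last line on its own when it has no partner. It is a variation on the commands above rather than the original answer, and the function name join_pairs is made up for the example.

```shell
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 > test.txt

# join_pairs FILE (hypothetical helper): join lines two at a time
join_pairs() {
    awk '{
        if ((getline b) > 0)   # a next line exists and is now stored in b
            print $0, b
        else                   # end of input: b was not updated, so leave it out
            print $0
    }' "$1"
}

join_pairs test.txt
```

The parentheses around getline b keep awk from reading the > 0 as part of the getline expression.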

 3 lines joined?

 

Just add a getline c (it has to be a different variable from b). Recall that getline moves the current line one step forward; getline always gets the next line

so that

$0 - is the current line
b - is the current line + 1 (next line)
c - is the current line + 2 (after next line)
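# b="" and c="" clear the old values, because getline leaves a variable unchanged at the end of the file
cat test.txt | awk '{b=""; c=""; getline b; getline c;printf("%s %s %s\n",$0,b,c)}'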

Output/Results:

1 2 3
4 5 6
7 8 9
10 11

NOTE: the 4 variations from before (2 ways to feed the file to awk, 2 ways to write printf) still apply. I'm just showing one of them, because it gets the point across. The same holds for the next example as well

4 lines joined?

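# b, c and d are cleared first so that a failed getline at the end of the file does not reprint an old value
cat test.txt | awk '{b=""; c=""; d=""; getline b; getline c; getline d;printf("%s %s %s %s\n",$0,b,c,d)}'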

Output/Results:

1 2 3 4
5 6 7 8
9 10 11

N lines joined?

You get the pattern. I hope…

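Just go ahead and add a
getline X;
and don't forget to add a " %s" to the printf format along with a ,X outside of the double quotes so that the %s gets filled with the value of X, where X is a new variable that has not been used yet. At the end of the file getline leaves X unchanged, so set X="" before the getline if you don't want a leftover value printed on the last line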
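
If adding variables by hand gets tedious, the same getline idea can be wrapped in a loop. This is a sketch rather than part of the original recipe: the function name join_lines and the awk variables n, line and nxt are names picked for the example. It checks the return value of getline, so a short final group is printed without leftovers.

```shell
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 > test.txt

# join_lines N FILE (hypothetical helper): join every N lines of FILE
join_lines() {
    awk -v n="$1" '{
        line = $0
        # read up to n-1 more lines and append each one that exists
        for (i = 1; i < n; i++)
            if ((getline nxt) > 0)
                line = line " " nxt
        print line
    }' "$2"
}

join_lines 5 test.txt
```

Calling join_lines 2 test.txt should reproduce the 2-line case from the start of the article, and changing the number is all it takes to join 3, 4 or more lines.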

The end..
