[SOLVED] Joining Three Lines of Text into One Line with Delimiters Between the Original Lines

Paulo2 · 03-16-2016, 01:22 PM

A bash solution

Code:

while read line;do let ++i;[ $i -lt 3 ] && echo -en "$line\t" || { i=;echo "$line";continue;};done <<<'1First_name Last_name
1House_Number Street_Name Street_Type
1City State Zip
2First_name Last_name
2House_Number Street_Name Street_Type
2City State Zip
3First_name Last_name
3House_Number Street_Name Street_Type
3City State Zip'
1First_name Last_name	1House_Number Street_Name Street_Type	1City State Zip
2First_name Last_name	2House_Number Street_Name Street_Type	2City State Zip
3First_name Last_name	3House_Number Street_Name Street_Type	3City State Zip

tronayne, could you please post timings for all those solutions?

Just curious which one is fastest.

tronayne · 03-16-2016, 02:06 PM

Well, that looks pretty slick, thanks.

So far, AWK goes like a striped ape (and it's a one-liner). I'll have to fiddle with more than one and see what's what, though. Actually, sed and paste fly too -- there's not really enough data to measure a difference; each typically takes about one or two seconds. So far, I like the AWK solution(s).
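
For readers following along, here is a minimal sketch of what one-line AWK and paste versions of the three-line join can look like. These are illustrations, not necessarily the exact commands posted earlier in the thread; the function names join3_awk and join3_paste are made up for this example, and both assume the input has a multiple of three lines.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch): join every three
# input lines into one line, with tabs between the original lines.

# AWK: print each line followed by a tab, or by a newline on every third line.
join3_awk() {
    awk '{ printf "%s%s", $0, (NR % 3 ? "\t" : "\n") }' "$@"
}

# paste: read standard input three times per output line; the default
# delimiter between the pasted lines is a tab.
join3_paste() {
    paste - - -
}
```

To compare speed, something like `time join3_awk < addresses.txt > /dev/null` would do (addresses.txt standing in for whatever test file you have), though on small inputs any difference is likely to be lost in the noise.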

riwi · 03-16-2016, 02:49 PM

I did not know awk was written by Kernighan, the same guy who wrote _the_ book on C (Kernighan & Ritchie), which I ploughed through back in 1995 and loved.

Quote:

Originally Posted by Richard Cranium

I'd just rather not, and probably because the "There's more than one way to do it" is a horrible thing to inflict upon the reader of your code for anything remotely complicated.

(Please don't take the above as a comment about the code that you posted. I just hate Perl.)

Perl is a lot faster than making several passes with sed, awk and other shell tools. Especially when things get more complicated, I would use perl instead of awk. I like awk only for manipulating a few strings.

And you are right, perl can be illegible too. But in practice the same goes for awk. It is up to the programmer to create something that is easy to maintain and understand. That is why I coded the perl option using several lines and not one.

tronayne · 03-16-2016, 03:18 PM

Actually, there were three guys who wrote AWK: Alfred Aho, Brian Kernighan and Peter Weinberger at Bell Labs. There is an "old" AWK and a "new" AWK: oawk came with System III, nawk with System V. System V Release 4 still comes with both oawk and nawk (so does Solaris). The nawk book was published in 1988; oawk was, pretty much, a man page and maybe a paper or two. The joke used to be that the name was short for "awkward", given the grammar and syntax, but you'll notice that the syntax bears a lot of resemblance to C with regular expressions thrown in for good measure. I've written a lot of AWK programs over the years; it does a lot for you and, if you know C, it's fairly easy to get the idea. And, from '88 to now, nobody's actually figured out a way to make it better (GNU extensions notwithstanding -- I refuse to use extensions 'cause they've burned my butt more than once in many languages over the years).

Anyway, as above, you can get "real" AWK from Brian Kernighan's web site at Princeton (where he teaches).

kjhambrick · 03-16-2016, 05:36 PM

Quote:

Originally Posted by tronayne

BTW, you can go to Brian Kernighan's web site, http://www.cs.princeton.edu/~bwk/, and download the source for a Linux version of AWK (New AWK, the one from the book, with updates and fixes). Sometimes that page is a little iffy, sometimes it's not (if you can't get to it, I'll e-mail to you if you want).

Thanks tronayne,

I already have it for nostalgia's sake<G>

I thought about wrapping it in a SlackBuild script but I've never gotten around to it.

nawk sits in my /usr/local/bin/ directory collecting dust these days.

-- kjh

GazL · 03-16-2016, 06:36 PM

I've been playing with sed this evening: trying to modify my example above to detect an arbitrary number of lines per record rather than hardcoding for 3 lines of input. All I can say is, I now know why they wrote awk!

Anyway, here it is:

Code:

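#!/usr/bin/sed -f

# A blank line ends the current record.
/^$/ b print

# Append the current line to the record accumulated in the hold space.
x
/^$/! {
  G
  h
}
# On the last input line, flush the record (even a one-line record).
$ b print
d

: print
  x
  /^$/d
  s/\n/\t/g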
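
For comparison, a minimal sketch of how the same job might look in AWK's paragraph mode. This is an illustration rather than code from the thread, and merge_records_awk is just a made-up name.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): merge each
# blank-line-separated record into one tab-delimited line.
merge_records_awk() {
    # RS = ""   : paragraph mode, records are separated by blank lines
    # FS = "\n" : each line of a record becomes one field
    # OFS = "\t": fields are joined with tabs when the record is rebuilt
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "\t" } { $1 = $1; print }' "$@"
}
```

Assigning `$1 = $1` changes nothing in the data; it only forces AWK to rebuild the record using OFS, which is what turns the newlines into tabs.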

Didier Spaier · 03-18-2016, 06:23 AM

Not strictly answering the initial question, but something a little more general: merging consecutive lines (replacing each newline with a tab) until the next empty line, as in GazL's previous post.

File input.txt:

Code:

I thought
that it could be fun
to write a poem,

unfortunately
I am not able

to do
that.
Sorry about that and
long life Thomas.

I am the last line.

Command:

Code:

sed -n -f mergelines.sed input.txt

File mergelines.sed (type a single literal [TAB] wherever you see white space, as \t is not POSIX-correct):

Code:

:a;${p;q};N;/	\n/{s/	\n/\n/;P;D};s/\n/	/;ba

Result:

Code:

I thought    that it could be fun    to write a poem,
unfortunately    I am not able
to do    that.    Sorry about that and    long life Thomas.
I am the last line.

PS: If a single paragraph is longer than 8192 bytes the outcome is unpredictable, as POSIX only requires the pattern space to hold at least that amount. That said, GNU sed can hold more than that.

GazL · 03-18-2016, 08:39 AM

Interesting to see an alternate implementation.

I learnt quite a lot about 'sed' writing the above; I don't think I've ever used sed for anything other than manipulating individual lines before. To be honest though, it's an unwieldy tool when it comes to anything other than that, so I doubt I'll use it for anything other than manipulating single lines in future either.

Thanks for raising the pattern space limit issue. I wasn't aware of that, or that '\t' wasn't POSIX.

tronayne · 03-18-2016, 09:27 AM

Thanks Didier,

Only one little tiny thing: it's sed -f mergelines.sed (at least with my version of sed). Works fine that way, not so hot without the -f.

I do use sed files for things I do over and over with data that will get loaded into databases or for cleaning up text files. For example, Windows Weenies don't seem to know the difference between a slash and a back slant (they put back slants in dates, which plays havoc with a DBMS). People seem bent on making sure that what they said actually is really, truly IMPORTANT!!!!!!!!! and don't know what an ellipsis is for............ either.

Just little files of sed directives; they make my life easier. When you've got thousands of lines of stuff to deal with, the stream editor does everything one line at a time, with no jumping back to the beginning of the file. It's fast.
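
A small sketch of what such a cleanup might look like. The rules and the name cleanup_text are invented for illustration and are not the actual directive files described above; in particular, the first rule converts every backslash on the line, which is only safe if the data has no legitimate backslashes.

```shell
#!/bin/sh
# Hypothetical cleanup filter (name and rules invented for this sketch).
cleanup_text() {
    # 1. turn every backslash into a forward slash (e.g. in 03\16\2016)
    # 2. collapse a run of exclamation marks into a single one
    # 3. collapse a run of three or more dots into exactly three
    sed -e 's|\\|/|g' \
        -e 's/!!*/!/g' \
        -e 's/\.\.\.\.*/.../g' "$@"
}
```

The same three expressions could just as well live one per line in a file and be run with sed -f.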

And AWK is pretty easy to deal with too, got a lot of AWK programs sitting around solving problems.

Thanks again and be well.

Didier Spaier · 03-18-2016, 09:42 AM

Quote:

Originally Posted by GazL

Interesting to see an alternate implementation.

I learnt quite a lot about 'sed' writing the above; I don't think I've ever used sed for anything other than manipulating individual lines before. To be honest though, it's an unwieldy tool when it comes to anything other than that, so I doubt I'll use it for anything other than manipulating single lines in future either.

I can understand that.

However, if some day you have nothing useful or fun to do, check out convtags (link in my signature below) to see something both long and ugly. Just run "sh convtags" to know more.

I think it's possible to write some functions in sed: before calling one (with t[label] or a b[label]), just store the return address somewhere in the hold space, then at the end of the function use that address to go back.

But I was too lazy to rewrite convtags to implement that. Which doesn't matter, as I doubt anyone ever used it.
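
The function-call idea can be sketched roughly as follows. This is only an illustration, not code from convtags: the function dup, the labels ret1 and ret2 and the wrapper sed_call_demo are all made up, and it assumes the hold space is free to carry the return tag (a script that already uses the hold space would have to append the tag to its data instead).

```shell
#!/bin/sh
# Hypothetical demo (all names invented for this sketch): a sed "function"
# called from two places, with the return address kept in the hold space.
sed_call_demo() {
    sed '
# Call 1: store return tag "1" in the hold space, then jump to dup.
x
s/.*/1/
x
b dup
:ret1
# Back from call 1: append "!" and call dup again with return tag "2".
s/$/!/
x
s/.*/2/
x
b dup
:ret2
# Back from call 2: branch to the end of the script, which prints the line.
b

# Function dup: double the pattern space, e.g. "ab" becomes "abab".
:dup
s/.*/&&/
# Return: append the saved tag, strip it again, and branch to the caller.
G
/\n1$/ {
s/\n1$//
b ret1
}
/\n2$/ {
s/\n2$//
b ret2
}
'
}
```

With this, `printf 'ab\n' | sed_call_demo` prints abab!abab!: the line is doubled, gets a "!" appended, and is doubled again. Since sed has no computed branch, the "return" is really a chain of tests, one per call site.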

Didier Spaier · 03-18-2016, 09:46 AM

Thanks Thomas, mistake corrected. Have a good day.