[SOLVED] Joining Three Lines of Text into One Line with Delimiters Between the Original Lines
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541
I have a file that contains
Code:
First_name Last_name
House_Number Street_Name Street_Type
City State Zip
I want the output to be a CSV file with tab delimiters between the three fields.
The names can include Jr, II, Sr, maybe a middle name too.
The addresses can be 12345 Nicholson Hill Road or 12345 Cedar (or mine, 12345 North US Hwy 23).
The City State and Zip are just that.
There is a blank line in the file between each record (I can simply remove those).
I want the three to be joined with a tab character as a delimiter.
I've been fiddling with paste and can't quite get that to work; I've been fiddling with sed, no joy there either. I'm just about ready to write it in C (and this is a one-off job).
I know I've done this before, I've just gotten too danged old to remember how.
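Code:
#!/usr/bin/perl
use strict;
use warnings;

# Join each group of three input lines into one tab-delimited output line.
# Assumes the blank separator lines have already been removed from list.txt.
open(my $in,  '<', './list.txt')      or die "cannot open list.txt: $!";
open(my $out, '>', './list_perl.txt') or die "cannot create list_perl.txt: $!";
while (defined(my $line1 = <$in>)) {
    my $line2 = <$in> // '';    # tolerate a truncated final record
    my $line3 = <$in> // '';
    s/\R//g for $line1, $line2, $line3;    # strip any line ending (LF or CRLF)
    print {$out} join("\t", $line1, $line2, $line3), "\n";
}
close($in);
close($out) or die "cannot close list_perl.txt: $!";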
Input:
Code:
First_name Last_name
House_Number Street_Name Street_Type
City State Zip
First_name Last_name
House_Number Street_Name Street_Type
City State Zip
First_name Last_name
House_Number Street_Name Street_Type
City State Zip
Output:
Code:
First_name Last_name House_Number Street_Name Street_Type City State Zip
First_name Last_name House_Number Street_Name Street_Type City State Zip
First_name Last_name House_Number Street_Name Street_Type City State Zip
I like Perl better because it is usually easier to read and much faster at handling large text files.
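Code:
#!/usr/bin/env python3
import fileinput
import itertools

field_width = 3  # number of input lines per output record
field_indexes = itertools.cycle(range(field_width))
fields = []
# Pair each input line with its position (0, 1, 2) inside the current record.
for index, line in zip(field_indexes, fileinput.input()):
    fields.append(line.strip())
    if index == field_width - 1:
        # Last line of the record: print it tab-delimited and start over.
        print('\t'.join(fields))
        fields = []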
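For what it's worth, the same loop can also be written with the usual "grouper" idiom, and filtering out empty lines first means the blank separator lines from the original file can stay where they are. This is only a sketch for Python 3: join_records is a name made up for illustration, and it assumes every record has exactly three non-blank lines.

```python
#!/usr/bin/env python3


def join_records(lines, width=3):
    """Yield one tab-delimited string per group of `width` non-blank lines.

    join_records is a hypothetical helper, not part of any library.
    """
    # Strip surrounding whitespace and drop the blank separator lines.
    stripped = (line.strip() for line in lines)
    non_blank = (line for line in stripped if line)
    # zip() pulls from the same iterator `width` times per tuple, so each
    # tuple holds consecutive lines; an incomplete final group is dropped.
    for group in zip(*[non_blank] * width):
        yield '\t'.join(group)


# Sample data standing in for the real file.
sample = [
    "Joe Bloggs\n",
    "5 Somewhere street, Somewhere.\n",
    "Some Zip\n",
    "\n",
    "Mary Smith\n",
    "27 Other Street, Otherplace.\n",
    "Another Zip\n",
]
for record in join_records(sample):
    print(record)
```

To run it over a real file, pass an open file object (for example open('list.txt')) instead of the sample list.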
Quote:
I like Perl better because it is usually easier to read and much faster at handling large text files.
When I used Perl in my day job, it felt as if the Devil was breaking wind into my face with almost every line that I read.
A large subset of the coding world loves Perl. There's nothing wrong with that (and I've seen some pretty amazing stuff written in Perl).
I'd just rather not, probably because "There's more than one way to do it" is a horrible thing to inflict upon the reader of your code for anything remotely complicated.
(Please don't take the above as a comment about the code that you posted. I just hate Perl.)
Code:
$ cat test.txt
Joe Bloggs
5 Somewhere street, Somewhere.
Some Zip

Mary Smith
27 Other Street, Otherplace.
Another Zip
Note that it does not matter whether the last line of the input file is blank or not.
Here's the output:
Code:
$ gawk 'BEGIN{ FS = "\n" ; RS = "" }{ print $1 "\t" $2 "\t" $3 }' test.txt
Joe Bloggs 5 Somewhere street, Somewhere. Some Zip
Mary Smith 27 Other Street, Otherplace. Another Zip
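The trick in that one-liner is awk's paragraph mode: an empty RS makes each run of blank lines a record separator, and FS = "\n" turns every line of a record into a field. Roughly the same idea in Python 3, as a sketch only: paragraphs_to_tsv is a made-up name, and unlike the awk command it joins however many lines a record has rather than exactly $1 through $3.

```python
import re


def paragraphs_to_tsv(text):
    """Return one tab-joined line per blank-line-separated paragraph.

    paragraphs_to_tsv is a hypothetical helper written for this example.
    """
    records = []
    # Split on one or more blank (or whitespace-only) lines, which is
    # close to what awk does when RS is the empty string.
    for paragraph in re.split(r'\n\s*\n', text.strip()):
        lines = [line.strip() for line in paragraph.splitlines()]
        lines = [line for line in lines if line]
        if lines:
            records.append('\t'.join(lines))
    return records


# Sample data standing in for test.txt, blank separator line included.
sample = (
    "Joe Bloggs\n"
    "5 Somewhere street, Somewhere.\n"
    "Some Zip\n"
    "\n"
    "Mary Smith\n"
    "27 Other Street, Otherplace.\n"
    "Another Zip\n"
)
for record in paragraphs_to_tsv(sample):
    print(record)
```

As with the awk version, a trailing blank line (or several) at the end of the input makes no difference, because the text is stripped before it is split.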
I feel like they missed a golden opportunity to order their names so their initials could come out as AWK instead of alphabetically.
Alfred Aho
Peter Weinberger
Brian Kernighan
Those are the original designers of the AWK language. I guess book publishers automatically alphabetize multiple authors (or maybe they just don't know the history)?
Original Poster
Quote:
Originally Posted by kjhambrick
tronayne --
I know you've solved this one but being an awk junkie, myself, I had to send you another one <G>
Don't delete the Blank Lines !!!
Hey, that's even slicker -- was looking through The AWK Programming Language this morning, hadn't noticed that (got busy with doing other things and just getting back to it).
BTW, you can go to Brian Kernighan's web site, http://www.cs.princeton.edu/~bwk/, and download the source for a Linux version of AWK (New AWK, the one from the book, with updates and fixes). Sometimes that page is a little iffy, sometimes it's not (if you can't get to it, I'll e-mail it to you if you want).
Get it and unzip it (be careful: create a directory, cd into it, then unpack it); you get the source and all the examples from the book. Type make, wait a while, then copy a.out to /usr/local/bin as nawk.
I use it instead of gawk. It doesn't have embedded GNU "features," and it works just fine.