split multi line record into multiple files with awk
Hi, I have a large file 'NS0923.csv' with data like the following. There are two records in this multi-record sample.
Code:
E60898,4578910,03/06/09,BEN BOYD RD,61,82,,,127,3,,52000.3046.001,3155,4.00,,PLT,1356,1.00,05/06/09,Y,Y,0551
1. I have come up with the following awk script.
Code:
gawk 'BEGIN {FS=OFS=","}
Code:
E60898,4578910,03/06/09,BEN BOYD RD,61,82,,,127,3,,52000.3046.001,05/06/09,Y,Y,0551
2. I still have to create a file 'transaction.csv' that should retrieve data from $13 - $15 with the identifying column $1. Required output:
Code:
E60898,4057,1.00,CLEAN CAR SHARE SIGN
3. And finally another file 'quantity.csv'. Retrieving data from $16 - $18 with identifier $1. Required output:
Code:
E60898,PLT,1356,1.00 |
The basic principle is simple in Perl
Code:
# split, retrieving only specified fields |
OK, so you are not asking a question, right? That's how you print your columns in awk. If you want, you can also use a for loop:
Code:
... |
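A minimal sketch of the for-loop approach, assuming all the fields sit on one physical line as in the sample row from the first post. The loop appends fields 16 to 18 to $1. The names first, last, out and result are illustrative and do not come from the thread.

```shell
# Sample row copied from the first post.
line='E60898,4578910,03/06/09,BEN BOYD RD,61,82,,,127,3,,52000.3046.001,3155,4.00,,PLT,1356,1.00,05/06/09,Y,Y,0551'

# Build the output record: the identifier ($1) followed by fields first..last.
result=$(printf '%s\n' "$line" | awk -F, -v OFS=, -v first=16 -v last=18 '{
    out = $1
    for (i = first; i <= last; i++)
        out = out OFS $i
    print out
}')

echo "$result"
```

Changing first and last to 13 and 15 would select the transaction columns instead, under the same single-line assumption.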
I'm afraid I am. I want to find out how I can produce
Code:
E60898,4057,1.00,CLEAN CAR SHARE SIGN
Code:
E60898,PLT,1356,1.00 |
Just use the same method of printing as you did in 1). Or does the data actually contain newlines?
|
I can only use $1 on the first line. How do I use the $1 from the first line for lines 2, 3 and 4, say? Then for the next set of records I'll have to do the same. I am hoping that there's a way to do this in awk.
Thanks. |
You save the first $1 in a variable.
|
Sorry for the confusion.
Not all lines start with EXXXX. Yes, there are multiple lines for the same record. |
Well, if you RTFM you can see that you can use print to write directly to a file.
Code:
gawk 'BEGIN {FS=OFS=","} |
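The script in the post above is cut off, so what follows is not a reconstruction of it. It is only a sketch of redirecting print to files, using the record1/record2 layout from this thread. Two things here are assumptions: that the first continuation line of each record belongs in 'transaction.csv' and the second in 'quantity.csv', and the output name main.csv. The variables id and n are illustrative too.

```shell
# Work in a scratch directory so the output files do not clobber anything.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

cat > sample.csv <<'EOF'
record1,foo,hello,world
aa,bb,cc
xx,yy,zz
record2,bar,hello,world
dd,ee,ff
uu,vv,ww
EOF

awk 'BEGIN { FS = OFS = "," }
# Header line: remember its identifier, reset the line counter, keep a copy.
/^record/ { id = $1; n = 0; print > "main.csv"; next }
# Continuation lines: route by position within the record (assumed layout).
{
    n++
    if (n == 1) {
        print id, $0 > "transaction.csv"
    } else if (n == 2) {
        print id, $0 > "quantity.csv"
    }
}' sample.csv

head main.csv transaction.csv quantity.csv
```

In awk, `print ... > "file"` truncates the file the first time it is opened and keeps appending for the rest of the run, so each output file ends up with one line per record.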
Okay, let me try this again - It's a multiline file.
record1,foo,hello,world
aa,bb,cc
xx,yy,zz
record2,bar,hello,world
dd,ee,ff
uu,vv,ww
I want a way to output this.
record1,aa,bb,cc
record1,xx,yy,zz
record2,dd,ee,ff
record2,uu,vv,ww |
Code:
$ awk -F"," '/record/{s=$1;next}{print s,$0}' OFS="," file |
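To see what that one-liner does, here is a self-contained run against the sample from the previous post. The only change is that the pattern is anchored as /^record/, an assumption that identifiers always start the line, so that a continuation line containing the word "record" is not mistaken for a header. The file name sample.txt and the variable result are illustrative.

```shell
tmp=$(mktemp -d)

cat > "$tmp/sample.txt" <<'EOF'
record1,foo,hello,world
aa,bb,cc
xx,yy,zz
record2,bar,hello,world
dd,ee,ff
uu,vv,ww
EOF

# s holds the identifier of the most recent header line;
# every other line is printed with s prepended.
result=$(awk -F, -v OFS=, '/^record/ { s = $1; next } { print s, $0 }' "$tmp/sample.txt")

echo "$result"
```

This prints the four lines asked for: record1,aa,bb,cc, record1,xx,yy,zz, record2,dd,ee,ff and record2,uu,vv,ww.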