Parsing a tab delimited text file
I'm currently writing a bash shell script that needs to take a tab delimited text file and convert it into a MySQL importable file. I have no experience with gawk (which is what I'm assuming I'd use - if not, please don't hesitate to correct me) - but this is what I'm looking to do:
Original File (the <TAB> is just representative of an actual tab):
1123432<TAB>114 Oceanside Drive<TAB>3324|4432|4432|2234<TAB>11234.jpg
Converting to:
"1123432","114 Oceanside Drive","3324|4432|4432|2234","11234.jpg"
Any help would be GREATLY appreciated - kind of stuck on this one. I'm reading the gawk man page, but it's not really sinking in without "sample code". Thanks! |
how about:
cat datafile | sed -e 's/^./"&/' -e 's/.$/&"/' -e 's/<TAB>/\",\"/g' > outfile
The <TAB> in the last sed expression is a literal tab, produced by pressing C-v and then Tab at the shell prompt. I'm sure it can be tidied up, as I'm not a sed expert. |
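The literal tab in the sed command above is easy to lose when the line is copied into a script. One way around that, as a rough sketch, is to build the tab with printf and splice it into the expression. The function name tsv_to_csv_sed is made up for this example, and it assumes the fields never contain a double quote.

```shell
#!/bin/sh
# Sketch only: tsv_to_csv_sed is a hypothetical helper name, not from the thread.
# Assumes no field contains a double quote.

# Build a literal tab portably; not every sed understands \t in a pattern.
TAB=$(printf '\t')

tsv_to_csv_sed() {
    # 1) quote before the first character, 2) quote after the last character,
    # 3) every tab becomes ","
    sed -e 's/^./"&/' -e 's/.$/&"/' -e "s/${TAB}/\",\"/g"
}

# Sample record from the question
printf '1123432\t114 Oceanside Drive\t3324|4432|4432|2234\t11234.jpg\n' | tsv_to_csv_sed
# prints: "1123432","114 Oceanside Drive","3324|4432|4432|2234","11234.jpg"
```

To convert a file, something like tsv_to_csv_sed < datafile > outfile should do.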
hello -
try: awk '{ print "\""$1"\",\""$2"\",\""$3"\",\""$4"\"" }' myfile.txt |
Kev82
That just puts quotes around the whole thing:
"1123432 114 Oceanside drive 3324|4432|4432|2234 11234.jpg"
Sliptwixt, that puts quotes around the first 4 whitespace-separated fields:
"1123432" "114" "Oceanside" "Drive"
Use [[:cntrl:]] with sed or grep to look for control characters. If <Tab> is the only control character then something like this would work:
sed -e's/[[:cntrl:]]/\"/g'
That would replace the three <Tab>s with a ", but unfortunately I'm sure there are other control characters in there, like linefeeds and/or carriage returns. I think Chr$(9) is the tab, so you need to make that the field delimiter. |
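Making the tab the field delimiter, as suggested above, might look something like this in awk. This is a sketch rather than a tested production script: tsv_to_csv_awk is a made-up name, and it assumes the fields contain no double quotes.

```shell
#!/bin/sh
# Sketch only: tsv_to_csv_awk is a hypothetical helper name.
# Assumes no field contains a double quote.

tsv_to_csv_awk() {
    awk 'BEGIN {
             FS  = "\t"        # split input on tabs only, so spaces stay inside a field
             OFS = "\",\""     # join output fields with ","
         }
         {
             $1 = $1                  # touching a field makes awk rebuild $0 using OFS
             print "\"" $0 "\""       # add the opening and closing quotes
         }' "$@"
}

# Sample record from the question
printf '1123432\t114 Oceanside Drive\t3324|4432|4432|2234\t11234.jpg\n' | tsv_to_csv_awk
# prints: "1123432","114 Oceanside Drive","3324|4432|4432|2234","11234.jpg"
```

Because the separator is set explicitly, "114 Oceanside Drive" stays in one field instead of being split at the spaces.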
I assumed the format was consistent and this was a quick-and-dirty one-time thing. If it were something I'd have to revisit more than once, I'd opt to script it in Perl or something so I have a little more control over inconsistencies in the datafile and/or some kind of error reporting.
I hope you find the solution that works best for you. |
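For what it's worth, a basic form of the error reporting mentioned above can also be had without leaving awk. The sketch below is not the Perl approach from the post, just an awk stand-in: it skips any line that does not have exactly four tab-separated fields and complains on stderr. The name check_and_convert and the field count of 4 are assumptions taken from the sample record, and writing to "/dev/stderr" assumes gawk, mawk, or a system where that path exists.

```shell
#!/bin/sh
# Sketch only: check_and_convert is a hypothetical helper name.
# Assumes every good record has exactly 4 tab-separated fields (as in the sample).

check_and_convert() {
    awk -v want=4 '
        BEGIN { FS = "\t"; OFS = "\",\"" }
        NF != want {
            # Report the bad line on stderr and leave it out of the output
            printf "line %d: expected %d fields, got %d\n", NR, want, NF > "/dev/stderr"
            bad = 1
            next
        }
        {
            $1 = $1                  # rebuild $0 with OFS between fields
            print "\"" $0 "\""
        }
        END { exit (bad ? 1 : 0) }   # non-zero exit status if any line was rejected
    ' "$@"
}
```

Something like check_and_convert datafile > outfile || echo "some lines were skipped" would then flag a bad import file before it reaches MySQL.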
Thanks All ...........
I really appreciate all of your help!
|
hummm...
kev82 That one works for me. |
Yeah, Perl works great for that stuff. That's what I use to delimit and format all my files to be parsed into SQL.
|
Yeah, my bad.
That's what I get for posting in Windows 95. I was using the Cygwin bash shell and I couldn't get the tab to work. I guess when I cut-n-pasted it into a script I kinda forgot the tab. |