AWK - issues pattern matching
Hi guys, I'd appreciate any ideas on the scenario below.
I have a file with a pattern like the one below. The 5th column holds an identifying value. I want to collect all the rows that share the same value in that column and save them to a file, using the 5th-column value as the file name. Here's what I have tried, but it doesn't work as expected:
Code:
#!/bin/bash
book2.csv has the raw input for processing.
Sample data and expected output:
Quote:
|
Put some of your field separators in ...
|
If the data in field $5 is 100% reliable, and guaranteed to be grouped, you could use it for the name of the file:
Code:
cat book2.csv \ Edit: removed redundant line. |
Hm, wouldn't just this be enough?
Code:
awk 'out=$5".txt"{print>out}' book2.csv
|
Print unconditionally into file names constructed from $5:
Code:
awk '{ print > ($5 ".txt") }' book2.csv
Or print only those rows whose $5 appears in the first field of input.txt:
Code:
awk 'NR==FNR { f[$1]; next } ($5 in f) { print > ($5 ".txt") }' input.txt book2.csv
|
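For reference, the second command reads input.txt first and remembers its first field as an allowed key, then prints only the book2.csv rows whose 5th field is one of those keys. A minimal sketch with made-up sample data, assuming comma-separated input (hence the added -F, which the commands above do not have) and an input.txt with one key per line:

```shell
# Sample data is invented for illustration; only the awk program mirrors the thread.
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch directory for the output files

printf '%s\n' 'a,b,c,d,101' 'e,f,g,h,102' 'i,j,k,l,101' 'm,n,o,p,http://curl.haxx.se' > book2.csv
printf '%s\n' 101 102 > input.txt                 # allowed keys, one per line

# First file (NR==FNR): remember each key. Second file: print only rows with a known key.
awk -F, 'NR==FNR { f[$1]; next } ($5 in f) { print > ($5 ".txt") }' input.txt book2.csv

cat 101.txt
```

The row with the URL in $5 is skipped, because that value is not listed in input.txt. If input.txt lists no key that occurs in book2.csv, nothing is written at all.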
Quote:
It's actually value1,,value3 or ,,value3,,value5. Basically, two consecutive commas enclose a single field with a blank value. My bad, I forgot to include the commas in my original post. Thanks for the heads up. |
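To illustrate the point about empty fields: with -F, every comma delimits a field, so consecutive commas yield empty fields and $5 still refers to the fifth column. A small sketch using the two line shapes mentioned above:

```shell
# Count the fields and show the first and fifth one for each line.
fields=$(printf '%s\n' 'value1,,value3,,value5' ',,value3,,value5' |
  awk -F, '{ print NF ": [" $1 "] [" $5 "]" }')

printf '%s\n' "$fields"
```

Without -F, awk splits on whitespace, so a line like these is a single field and $5 is empty.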
Quote:
It works fine. I ended up with this one, since I forgot to include the commas in my post.
Quote:
How do we tell AWK not to open the URLs but just process the file? Cheers! |
No problem. Though do look at MadeInGermany's second example in #5.
Quote:
Edit: Or else $5 needs to be validated. Again see the second example in #5 above. My example came with the caveat about $5 containing only numbers. |
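One way to validate $5, assuming valid keys consist only of digits (an assumption taken from the caveat above, not something the sample data confirms), is to guard the print with a regular expression and send everything else to a catch-all file. The name rejects.txt and the sample rows are made up for this sketch:

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch directory for the output files
printf '%s\n' 'a,b,c,d,101' 'e,f,g,h,http://curl.haxx.se' 'i,j,k,l,101' > book2.csv

awk -F, '
  $5 ~ /^[0-9]+$/ { print > ($5 ".txt"); next }   # key is all digits: use it as the file name
                  { print > "rejects.txt" }       # anything else goes to one catch-all file
' book2.csv

ls
```

Checking rejects.txt afterwards shows which rows did not carry a usable key.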
Quote:
It works fine, but it stopped when it found a URL in the data. It says awk cannot open http://curl.haxx.se (no such file or directory). How do we get AWK to avoid this error? The second command you gave doesn't output any data. Thanks for your help. |
If the fifth field does not have numbers but URLs instead, the slashes are going to give you trouble in the file names. The slash is the path separator, so it cannot appear inside a file or directory name, and it cannot be escaped. You'll need to think of some other naming convention, for example using the gsub() function in your AWK script to replace the slashes with another character.
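A possible naming convention, sketched under the assumption that replacing every character outside letters, digits, dot, underscore, and hyphen with "_" is acceptable. The variable name and the sample rows are made up:

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch directory for the output files
printf '%s\n' 'a,b,c,d,101' 'e,f,g,h,http://curl.haxx.se' > book2.csv

awk -F, '{
  name = $5                             # work on a copy so the printed row stays unchanged
  gsub(/[^A-Za-z0-9._-]/, "_", name)    # replace slashes, colons, etc. with "_"
  print > (name ".txt")
}' book2.csv

ls
```

Note that two different keys can collapse to the same file name after the replacement, and an empty $5 ends up in a file called ".txt".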
|
Quote:
|
Please show one of the offending lines plus the exact script you are trying.
|
Quote:
Script: Quote:
|
If lines are composed of a variable number of fields, but the key number is always the last field, try $NF instead of $5 (or $(NF-1) for the second-to-last field, and so on).
Code:
awk -F, '{print>($NF".txt")}' book2.csv
|
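A quick way to see the effect of $NF, using invented rows with differing field counts where the key number is always last:

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch directory for the output files
printf '%s\n' 'a,b,c,d,101' 'http://curl.haxx.se,x,y,z,w,102' 'q,101' > book2.csv

# $NF is the last field of each row, however many fields the row has.
awk -F, '{ print > ($NF ".txt") }' book2.csv

cat 101.txt
```

Here the URL sits in the first field, so it never reaches the file name.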
In addition to shruggy's suggestion you might show this output:
Code:
grep curl.haxx.se book2.csv
|