Split on commas and square brackets in a CSV file
Hi,
I have a CSV file whose records are in this format:
hello,world,hds,[84,198,90],hdi,[89,92,200]
The number of elements within the square brackets can vary from 0 to N (0..100). How do I split the file using awk? I have tried different variations, but I cannot get it working:
awk -F '[][,]' '{print "{ a:\""$1"\", b:\""$2"\", v:\""$3"\",f:"$4", j:"$5", k:"$6"}"}'
Before:
hello,world,hds,[84,198,90],hdi,[89,92,200]
After (a JSON object):
{ a: "hello" b: "world" v: "hds" f: [84,198,90] j: "hdi" k: [89,92,200] }
Any help is appreciated. Thanks, |
Before and after, please.
|
Have:
Code:
hello,world,hds,[84,198,90],hdi,[89,92,200]
Code:
{
Code:
echo "hello,world,hds,[84,198,90],hdi,[89,92,200]" \ |
^ Thanks for the translation, dannyb. The brackets can contain between 0 and 100 items...
|
Thanks dannyb, but the number of elements within the square brackets can vary. I don't care much about the formatting, as long as the output is valid JSON.
|
Try this:
Code:
[rkn] ~ $ cat try.awk
The expression consists of 6 subexpressions contained in parentheses, with a comma between each pair of subexpressions. Subexpressions 1, 2, 3, and 5 match any number of characters that are not a comma. Subexpressions 4 and 6 match a literal '[' followed by any number of characters that are not a ']' and then a literal ']'. The match() function stores in the elements of array aa the characters that match each parenthesized subexpression. |
Can you try this?
# Input file
Code:
$ cat curosity.txt
Code:
$ cat curosity.sh
Code:
$ ./curosity.sh |
Input file:
Code:
hello,world,hds,[84,198,90],hdi,[89,92,200]
Code:
sed -e 's/\([^0-9],\)/&~/g' $InFile \ |
As the formatting is trivial once you have the correct fields, if you are using gawk 4+, you can use:
Code:
awk 'BEGIN{FPAT = "[^,]+|\\[[^]]+\\]"}{...}' file |
Here's a version that doesn't require any special extensions:
Code:
awk -F'[,][[]|[]][,]?' -v OFS='\n' \ |
Quote:
Daniel B. Martin |
Ah, so there are. But yeah, that's just a problem of formatting the print command. Easy enough to add them in.
It might be a good idea to switch to using printf for this, too. |
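For what it's worth, here is a rough sketch of what a printf-based version could look like without any gawk extensions. Note that it does not use the -F separator trick from the earlier post: it splits on every comma and then glues the pieces of each bracketed list back together. The function name to_json_portable, the variable names, and the quoted keys are my own, and it assumes six fields per record with no nested brackets.

```shell
# Sketch only: plain awk, no extensions. The function name is hypothetical.
to_json_portable() {
  awk '{
    n = split($0, part, ",")          # split on every comma first
    nf = 0; inlist = 0
    for (i = 1; i <= n; i++) {
      if (inlist) field[nf] = field[nf] "," part[i]   # still inside [ ... ]: rejoin
      else        field[++nf] = part[i]               # start a new field
      if (part[i] ~ /^\[/) inlist = 1                 # a list opens with this piece
      if (part[i] ~ /\]$/) inlist = 0                 # a list closes with this piece
    }
    if (nf == 6)
      printf "{ \"a\": \"%s\", \"b\": \"%s\", \"v\": \"%s\", \"f\": %s, \"j\": \"%s\", \"k\": %s }\n",
             field[1], field[2], field[3], field[4], field[5], field[6]
  }'
}

printf '%s\n' 'hello,world,hds,[84,198,90],hdi,[89,92,200]' | to_json_portable
```

printf keeps the quoting and commas in one format string, which is easier to get right than a long chain of concatenated print arguments.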
Just for completeness, here is a slight alternative:
Code:
awk 'BEGIN{FPAT = "[^,]+|\\[[^]]+\\]";split("abvfjk",a,"")}{l="{\n";for(i=1;i<=NF;i++)l=l a[i]":"$i RS;print l "}"}' file |
Quote:
Daniel B. Martin |
Works just fine for me using your example :)
Code:
grail@pilgrim:~$ awk 'BEGIN{FPAT = "[^,]+|\\[[^]]+\\]";split("abvfjk",a,"")}{l="{\n";for(i=1;i<=NF;i++)l=l a[i]":"$i RS;print l "}"}' f2 |