[SOLVED] AWK / SED - Parsing a CSV file with comma delimiter, and some extra needs.
In other words, I'm trying to apply the following rules:
1) Every comma-separated field (including blank ones) must be simple-quoted. A comma inside a double-quoted field is part of that field and does not separate fields.
2) Every simple quote contained within a field must be doubled.
3) If a field is already enclosed in double quotes, replace them with simple quotes.
My main problem is that I don't know how to make AWK or SED ignore commas inside double quotes when parsing the fields. I'm pretty sure that once this step is done, I can do the rest.
I also thought that it might be a better idea to use a "full" programming language for this, but I didn't want to take this thread off the zero reply list.
for every character in the line
    is it a comma?
        print a closing single quote, the comma, and an opening single quote
    or is it a double quote?
        skip forward until the next double quote, printing each character
    or is it a single quote?
        print it twice
    it's neither of the above
        print the current character
And by the way, they're "single" quotes, not "simple" quotes.
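For what it's worth, here is a minimal sketch of that character-by-character scan in plain POSIX awk. It is one possible reading of the pseudocode above, not a tested solution from this thread. The function name quote_csv is made up for illustration. It assumes one record per line, no line breaks inside quoted fields, and no escaped double quotes ("") inside a quoted field; an unterminated double quote simply swallows the rest of the line. The octal escape \047 stands for a single quote, so the awk program can sit inside a single-quoted shell string.

```shell
#!/bin/sh
# quote_csv: hypothetical helper name, not taken from the thread.
# Reads CSV lines on stdin, writes the single-quoted version on stdout.
quote_csv() {
    awk '
    {
        out = "\047"        # opening quote of the first field
        inq = 0             # 1 while inside a double-quoted field
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                inq = !inq                 # drop the double quote, toggle state
            } else if (c == "," && !inq) {
                out = out "\047,\047"      # close field, comma, open next field
            } else if (c == "\047") {
                out = out "\047\047"       # rule 2: double an embedded single quote
            } else {
                out = out c                # any other character is copied as-is
            }
        }
        print out "\047"    # closing quote of the last field
    }'
}
```

Example use:

```shell
printf '%s\n' "a,\"b,c\",d'e,,f" | quote_csv
# prints: 'a','b,c','d''e','','f'
```

Instead of a separate "skip forward" loop, the sketch keeps a flag (inq) that flips on every double quote. The effect is the same: a comma seen while the flag is set is copied as ordinary text rather than treated as a separator.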