[SOLVED] sum the third field of csv file ignoring the commas in double quotes
My question is whether the 'col2' field could contain more or less data, so that the column being summed would not always be the same; i.e., the values all just happen to be in the fourth comma-separated field at present, like
I am curious though, why the use of 'next' in your script?
Because I'm not that certain of awk's default behavior, and I often state what to do next when it is unnecessary. I guess 'next' is only required when there are subsidiary statements one wishes to skip.
That feature is a recent addition to gawk, available only in version 4 and up. I'm still using 3.1.8, and my system's gawk man page doesn't even mention FPAT. Good to know, though: once the final Slackware 14 is released, I'll be able to use it.
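For readers who do have gawk 4.0 or later, here is a minimal sketch of how FPAT might be used for this task. It is not the script from the earlier post; the function name `sum_third_field_fpat` is made up for illustration, and the pattern is the basic one from the gawk manual, which assumes no empty fields and no escaped quotes inside quoted fields.

```shell
# Hypothetical helper: sum the third CSV field using gawk's FPAT.
# Requires gawk 4.0 or later; other awks will not honour FPAT.
sum_third_field_fpat() {
    gawk '
        # FPAT describes what a field looks like instead of what
        # separates fields: either a run of non-commas, or a
        # double-quoted string (which may itself contain commas).
        BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" }
        { sum += $3 }
        END { print sum + 0 }
    ' "$@"
}
```

Called with no file arguments, the function reads standard input, so it can be fed from a pipe such as `printf '%s\n' 'a,"one, two",10' | sum_third_field_fpat`.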
For older versions of awk, assuming the double quotes are well balanced and you're not interested in the content of the quoted fields, you can simply remove them and split the record by the remaining commas, e.g.
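As a rough sketch of that approach (not necessarily the command originally posted; the sample rows and the function name `sum_third_field` are invented for illustration), one could delete each quoted string with `gsub` and let awk re-split the record on the commas that remain:

```shell
# Hypothetical helper: sum the third CSV field, ignoring commas that
# sit inside balanced double quotes. Uses only POSIX awk features.
sum_third_field() {
    awk -F, '
        {
            # Delete every "quoted" string from the record. Changing
            # $0 makes awk re-split it on the remaining commas.
            gsub(/"[^"]*"/, "")
            sum += $3
        }
        END { print sum + 0 }
    ' "$@"
}

# Made-up sample rows: the quoted second field holds 1, 0 and 2 commas.
printf '%s\n' \
    'a,"one, two",10' \
    'b,"plain",20' \
    'c,"x, y, z",30' | sum_third_field
```

With these three rows the quoted field is emptied each time, so the value to sum always lands in `$3` and the total printed is 60. The content of the quoted fields is lost, which is fine only because it is not needed here.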
Anyway, awk is not the right tool for parsing tricky CSV files (specifically, you cannot build a regexp FS that excludes commas inside quotes). The Perl or Python modules for parsing CSV files are better suited to these tasks.