A significant part of the challenge of splitting city,state into two separate fields is that not every location field that contains a city also contains a state. (This is one of the reasons sed may not be the best solution.)
pan64 has outlined a "plan of attack" that should work, but again, it's your work to do.
Is this the entire body of input text you need to contend with?
Quote:
Originally Posted by amy441
This is the output of the cat abc.csv command. This is actually homework given by my teacher, and he specifically said to split the column using sed.
Year,Rank,Organization ID,Organization Name,Organization Location,Private Income,Total Assets,Service Expense,Fundraising Expense
2004,1,321092,Salvation Army,"Alexandria, Va.","$1,324,089,000","$3,039,922,000","$2,126,200,000","$123,791,000"
2004,2,321148,American Cancer Society,Atlanta,"$794,000,000","$836,295,000","$610,639,000","$188,150,000"
2004,3,321036,Gifts In Kind International,"Alexandria, Va.","$787,192,199","$790,561,819","$792,432,766","$167,242"
I feel you should consider visiting with your instructor and asking them what the full point of this assignment truly is.
You can edit a file using sed. You can also use tr and awk. Each tool has benefits and some drawbacks.
I understand that they told you to use only sed.
A HUGE concern for me is summarized here:
Quote:
,"Alexandria, Va.",
,Atlanta,
,"Alexandria, Va.",
They've been inconsistent twice with their text for this problem:
That middle term is not enclosed in quotation marks.
It also does not contain a state.
If their intention was to have you stretch your mind to properly change the first and third terms solely using sed, that's one thing.
Is the whole point of their assignment to be able to contend with the first and third terms, as well as the second term?
Is there more input text you haven't shown by the way?
Are you supposed to fix the quotation marks around "Alexandria, Va." so that it becomes "Alexandria","Va."?
Are you supposed to add quotation marks around Atlanta?
And as asked by another person, what are you supposed to do about that term missing the state?
Last edited by rtmistler; 06-11-2018 at 10:08 AM.
Reason: Added more Q's
Missing fields are simply consecutive commas in a CSV. It's not actually that hard to do the entire exercise in sed - doing it in one stanza gets interesting - but it is an exercise for the OP.
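For anyone reading along later, here is one rough way the "consecutive commas" idea could look in sed. It is only a sketch under some assumptions that are not in the original posts: the location is always the 5th field, the first four fields never contain commas or quotes (true for the three sample rows, not necessarily for the full file), and the new header names City and State are made up. The function name split_location is also just for illustration. It uses -E, which GNU and BSD sed both accept.

```shell
#!/bin/sh
# Sketch only, not a definitive answer to the assignment.
# Assumptions: location is field 5; fields 1-4 contain no commas or quotes.
# Steps:
#   line 1: rename the header (City,State are invented names), then stop
#   quoted "City, St." in field 5 -> City,St.
#   t: if that substitution happened, skip the next one
#   unquoted city with no state -> City,, (an empty State field)
split_location() {
  sed -E \
    -e '1s/Organization Location/City,State/' \
    -e '1b' \
    -e 's/^(([^,]*,){4})"([^,"]*), *([^"]*)"/\1\3,\4/' \
    -e 't' \
    -e 's/^(([^,]*,){4})([^,"]*),/\1\3,,/'
}

split_location <<'EOF'
Year,Rank,Organization ID,Organization Name,Organization Location,Private Income,Total Assets,Service Expense,Fundraising Expense
2004,1,321092,Salvation Army,"Alexandria, Va.","$1,324,089,000","$3,039,922,000","$2,126,200,000","$123,791,000"
2004,2,321148,American Cancer Society,Atlanta,"$794,000,000","$836,295,000","$610,639,000","$188,150,000"
2004,3,321036,Gifts In Kind International,"Alexandria, Va.","$787,192,199","$790,561,819","$792,432,766","$167,242"
EOF
```

With the sample rows, the Alexandria lines end up with Alexandria,Va. as two fields and the Atlanta line ends up with Atlanta,, so every row has the same number of columns. The t command matters: without it, the second substitution would also fire on rows the first one already fixed and add a stray empty field.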
I've spent way more time in the bowels of csv files than I ever wanted to.
Just to clarify a couple of things:
The fields are separated by commas.
If the field contains a comma, then it is quoted.
So, in the second row, the lack of a comma and state is why that field is not quoted.
If the OP could save the data from the spreadsheet in a tab-delimited format, that would remove all the delimiting commas and the quotes from the data, which might help things some.
In any case, it's important to differentiate between commas that are delimiters and commas that are data, and why the quotes are there.
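To make the delimiter-versus-data distinction concrete, here is a small sketch that counts fields two ways. The function names and the choice of ; as a stand-in character are assumptions for illustration only (it assumes ; does not otherwise appear in the data). The sed loop swaps one comma at a time, but only a comma that sits after an odd number of double quotes, which means it is inside a quoted field.

```shell
#!/bin/sh
# Sketch only. Hypothetical helper names; ';' is an arbitrary placeholder.

# Replace commas that are data (inside quotes) with ';', one per loop pass.
# The pattern allows any number of complete "..." pairs, then one opening
# quote, then text with no quote or comma, then the comma to replace.
protect_data_commas() {
  sed -E -e ':a' -e 's/^([^"]*("[^"]*"[^"]*)*"[^",]*),/\1;/' -e 'ta'
}

# Counts every comma as a delimiter, which overcounts quoted fields.
naive_field_count() {
  awk -F, '{ print NF }'
}

# Counts only the commas that are real delimiters.
quoted_field_count() {
  protect_data_commas | awk -F, '{ print NF }'
}

row='2004,2,321148,American Cancer Society,Atlanta,"$794,000,000","$836,295,000","$610,639,000","$188,150,000"'
printf 'naive: %s\n' "$(printf '%s\n' "$row" | naive_field_count)"
printf 'quote-aware: %s\n' "$(printf '%s\n' "$row" | quoted_field_count)"
```

For the Atlanta row this prints naive: 17 and quote-aware: 9, and 9 matches the number of column names in the header. Any sed expression that treats every comma as a delimiter is working from the wrong count.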