LinuxQuestions.org
Latest LQ Deal: Latest LQ Deals
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Software
User Name
Password
Linux - Software This forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.

Notices


Reply
  Search this Thread
Old 08-03-2012, 08:24 AM   #1
ingram87
LQ Newbie
 
Registered: Aug 2011
Posts: 9

Rep: Reputation: Disabled
How to script csv editing? Remove rows from csv file that do not contain certain text


I have a csv file with a column for states. How do I write a script to remove all rows that are not "NY" in that column? I'm not even sure where to begin, as I don't know of any tools to edit csv files from the command line.

If anyone knows where to point me, or knows a short script that does this, it would be much appreciated.
 
Old 08-03-2012, 08:45 AM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743
The tools do not care whether it is a csv file. Are you familiar with AWK and SED? If not, read all about them here:
http://www.grymoire.com/Unix/

Also, please post a sample of what the file looks like.
 
Old 08-03-2012, 08:50 AM   #3
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669
CSV = comma separated values and is simply a text file using commas as delimiters.

Your file should look something like:
123 State Street,Albany,NY,10005
247 City Road,Atlanta,GA,30008
4957 County Avenue,Durham,NC,50333
2932 Parish Lane,New Orleans,LA,79999
4193 Borough Circle,New York,NY,10029

Since you know there is a comma before and after the value you're interested in the simplest way would be to simply grep for the pattern:

grep ,NY, csvfile

For above example that would output:
123 State Street,Albany,NY,10005
4193 Borough Circle,New York,NY,10029

You can redirect the above output into a new file:

grep ,NY, csvfile >newcsvfile

Note that you'd want to simply "cat csvfile" first to make sure the fields are as expected (e.g. it is not padded with extra spaces somehow such as ",NY ,")
 
Old 08-03-2012, 08:55 AM   #4
ingram87
LQ Newbie
 
Registered: Aug 2011
Posts: 9

Original Poster
Rep: Reputation: Disabled
I'm not familiar with those tools, will look into them. Here is a sample of the csv file:
Code:
id,name,state,
50,Jamie,AL,
51,Jenifer,GA,
52,George,NY,
53,Corey,NY,
54,Leslie,TN,
55,David,NY,
 
Old 08-03-2012, 09:07 AM   #5
ingram87
LQ Newbie
 
Registered: Aug 2011
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thankyou MensaWater. This is what I needed
 
2 members found this post helpful.
Old 08-03-2012, 09:42 AM   #6
ingram87
LQ Newbie
 
Registered: Aug 2011
Posts: 9

Original Poster
Rep: Reputation: Disabled
Instead of removing the lines that do not contain NY, how could I also send those to a new file so I can see which ones were removed?
 
Old 08-03-2012, 09:51 AM   #7
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743Reputation: 743
Code:
grep ", NY," filename > newfilename
Sends all lines containing the quoted phrase to "newfilename"

Note that this is not totally robust---i.e. it assumes that that pattern does not occur anywhere else. With AWK, you can do operations based on the specific field.

In addition to the references I linked, also go here: http://tldp.org
Get the Bash Guide for Beginners, and---later---the Advanced BASH Scripting Guide
 
Old 08-03-2012, 12:23 PM   #8
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669
Quote:
Originally Posted by ingram87 View Post
Instead of removing the lines that do not contain NY, how could I also send those to a new file so I can see which ones were removed?
When you use grep it gives you the pattern for which you're searching. To have it instead give you everything EXCEPT the pattern you use the -v option.

Code:
grep -v ",NY," filename > noNYentries
 
Old 08-03-2012, 12:34 PM   #9
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,313

Rep: Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918Reputation: 918
awk solution:
Code:
awk -F , '$3 == "NY" {print $0}' ingram87.csv
 
2 members found this post helpful.
Old 08-03-2012, 12:45 PM   #10
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669Reputation: 1669
Quote:
Originally Posted by schneidz View Post
awk solution:
Code:
awk -F , '$3 == "NY" {print $0}' ingram87.csv
The above would be to get the records with NY.

For getting the ones WITHOUT NY:

Code:
awk -F , '$3 != "NY" {print $0}' ingram87.csv
And just as in the grep examples you can redirect to new files with the "> newfilename" at end of the line.
 
  


Reply

Tags
csv


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
consolidating rows from csv file noony123 Linux - Newbie 3 12-14-2011 07:47 AM
how to convert a simplte text file into csv using awk script certteam Linux - General 1 09-15-2010 12:23 AM
How to ignore rows with a specific character in a csv file ziggy25 Linux - Newbie 8 03-13-2010 12:38 PM
convert columns to rows (tab separated file to csv) doug23 Programming 16 08-16-2009 09:14 PM
bash script - remove header row from csv file pljvaldez Programming 5 08-30-2006 11:05 AM

LinuxQuestions.org > Forums > Linux Forums > Linux - Software

All times are GMT -5. The time now is 09:15 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration