LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 02-23-2009, 12:22 PM   #1
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Rep: Reputation: 30
Bash: How to remove rows in CSVs where Price column is less than 30


I have tons of csv/txt files (~5gb & 300 files). Is there a way to edit all the files where if the price is less than 30, then it will simply remove the rows?

Say I have this in my csv:
Code:
ID,NAME,PRICE
1,BLUE,25
2,GREEN,36
3,TEAL,37
4,BLACK,0
5,PURPLE,10
After the command, the csv will have this:
Code:
ID,NAME,PRICE
2,GREEN,36
3,TEAL,37
Thanks in advance
 
Old 02-23-2009, 12:29 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983
Code:
awk -F, '$NF >= 30' file.cvs
 
Old 02-23-2009, 12:35 PM   #3
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Original Poster
Rep: Reputation: 30
The problem is the actual file looks like this:
Code:
ID,NAME,PRICE,SHIPPING
1,BLUE,25,5.99
2,GREEN,36,2.99
3,TEAL,37,6.99
4,BLACK,0,1.99
5,PURPLE,10,9.99
so the desired output is like this:
Code:
ID,NAME,PRICE,SHIPPING
2,GREEN,36,2.99
3,TEAL,37,6.99
In your code, I don't see how it identifies the PRICE column explicitly? Thanks though colucix!!
 
Old 02-23-2009, 12:52 PM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983
Code:
awk -F, '$(NF-1) >= 30' file.csv
The -F, option tells awk to use a comma as Field Separator. The field is $(NF-1) if you want to parse the column which comes before the last one. That is I count the columns from the last one backward. If you want to count from the first column onward, just do
Code:
awk -F, '$3 >= 30' file.csv
 
Old 02-23-2009, 12:58 PM   #5
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Original Poster
Rep: Reputation: 30
This is genius.. because it's so simple! thanks again colucix
 
Old 04-11-2009, 03:06 PM   #6
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Original Poster
Rep: Reputation: 30
Is there a way to execute the command: awk -F, '$3 >= 30' file.csv

and have the output saved as file.csv as well?

I have many files and wish to edit the existing file instead of making new ones.

Code:
awk -F, '$3 >= 30' file.csv > file.csv
doesn't seem to work

Last edited by guest; 04-11-2009 at 03:31 PM.
 
Old 04-11-2009, 04:39 PM   #7
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983Reputation: 1983
Nope. Awk cannot edit files in place. You have to use sed with the -i option, but since sed cannot test numeric expressions, you have to find a regular expression matching the price less (or more) than 30. You can try with something like:
Code:
sed -i.bck '/.*,.*,[0-9],.*/d;/.*,.*,[0-2][0-9],.*/d' file.csv
that is you explicitly put all the commas separated fields. The third field will be either a one-digit number between 0 and 9 or a two digit number between 00 and 29. In this way lines with price less than 30 will be deleted.

The -i.bck does a backup copy of the original file with the .bck extension added. This is for safety. Once you've checked the result you can easily remove the backup files by rm *.bck.
 
Old 04-11-2009, 05:19 PM   #8
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515

Rep: Reputation: 239Reputation: 239Reputation: 239
yes,
when you do something like,

Code:
command infile > outfile
the output file is created and/or truncated initially.
hence, if it's the same name, you destroy the file before you open it.

still never mind eh?

one to remember.
 
Old 04-11-2009, 06:15 PM   #9
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Original Poster
Rep: Reputation: 30
For now, this is what I did:

Code:
for file in *
do
awk -F, '$3 >= 30' "$file" > edited."$file"
done
But what's an easy way to rename edited.filename.csv back to filename.csv?
 
Old 04-11-2009, 06:20 PM   #10
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515

Rep: Reputation: 239Reputation: 239Reputation: 239
[QUOTE=guest;3505878]For now, this is what I did:

Code:
for file in *
do
mv $file $file.old
awk -F, '$3 >= 30' "$file.old" > "$file"
done
rm *.old

laterally?
 
Old 04-11-2009, 06:44 PM   #11
guest
Member
 
Registered: May 2003
Distribution: CentOS 5 64 bit
Posts: 255

Original Poster
Rep: Reputation: 30
That works I did something similar.. just renaming operations after the awk command
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
How to convert 1 column into several rows in Linux? markraem Linux - Software 9 03-30-2010 11:24 AM
how to remove 72 column limit in gfortran mkrems Programming 1 06-13-2008 06:47 AM
Column into rows bharatbsharma Programming 1 10-25-2007 02:23 AM
Parsing rows and column data from a file using perl dav_y2k Programming 1 10-08-2006 11:57 AM
"Price Paid" column in the HCL pages slackist LQ Suggestions & Feedback 0 07-02-2005 09:38 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 03:00 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration