LinuxQuestions.org
Old 08-02-2006, 01:10 PM   #1
koobi
Member
 
Registered: Jun 2006
Location: Colombo, Sri Lanka
Distribution: Ubuntu
Posts: 103

Rep: Reputation: 15
removing first line with AWK


I want to remove the first line of a CSV using AWK.

I could do this in PHP but then what i'd have to do is read in the CSV, unset the first element of the array and fwrite the data back. but the CSV is HUGE and i don't want to take up all that memory.

so, how would i use awk to remove the first line of a CSV? i suppose we would remove characters until we come across a CR or LF, but how would i do that?
also, i'll have to run this as a cron job. any guidance?

thanks for your time
 
Old 08-02-2006, 01:55 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,415

Rep: Reputation: 1968
Code:
awk '{if (NR!=1) {print}}' filename.csv
 
Old 08-02-2006, 05:01 PM   #3
koobi
Member
 
Registered: Jun 2006
Location: Colombo, Sri Lanka
Distribution: Ubuntu
Posts: 103

Original Poster
Rep: Reputation: 15
great, thanks

would you also mind telling me what exactly happens please? or at least refer me to a good site where i can look this up?

does this open filename.csv, remove the first line and leave the file (minus the first line, of course?)

i don't know AWK but it seems like it prints everything but the first line to stdout?

what is NR?
does {print} output to stdout?



thanks for your time
 
Old 08-02-2006, 05:14 PM   #4
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,415

Rep: Reputation: 1968
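NR = the number of records read so far, i.e. the number of the current record. records are, by default, separated on a newline, so NR is effectively the current line number. if the value of NR is 1 then awk is on the first line, so ignore it. otherwise print the whole line. and yes, print writes to stdout; the file itself is not changed.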
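
To see NR for yourself, here is a minimal sketch. The three-line sample CSV is made up for illustration and is not from the original poster's data.

```shell
# Prefix every record with its record number (NR).
# For this sample, awk prints "1: id,name", then "2: 1,alice", then "3: 2,bob".
printf 'id,name\n1,alice\n2,bob\n' | awk '{ print NR ": " $0 }'

# Same test as in post #2: skip the record where NR is 1, print the rest.
# The result goes to stdout, which is captured in a variable here.
result=$(printf 'id,name\n1,alice\n2,bob\n' | awk '{ if (NR != 1) { print } }')
echo "$result"
```

The input is never modified; awk only writes the remaining lines to stdout.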
 
Old 08-02-2006, 05:26 PM   #5
koobi
Member
 
Registered: Jun 2006
Location: Colombo, Sri Lanka
Distribution: Ubuntu
Posts: 103

Original Poster
Rep: Reputation: 15
great, thanks

so then awk accesses filename.csv, reads all its contents to memory, ignores the first line and prints everything else to stdout?


how would i write it back to the same file?

would this work?
Code:
awk '{if (NR!=1) {print}}' filename.csv > filename.csv



ideally, it would read only the first line and delete it along with the CR/LF character at the end so that the rest of the records move up by a row.
can i do that in awk?
 
Old 08-02-2006, 05:40 PM   #6
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,415

Rep: Reputation: 1968
you should never write directly back to the same file. if you want to automate this, write to say, filename.tmp and then rename the file to overwrite the original one once it's completed.
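
A minimal sketch of that temp-file-then-rename approach, assuming the names filename.csv and filename.tmp from this thread. The sample data and the scratch directory are made up so the example can be run safely.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
printf 'id,name\n1,alice\n2,bob\n' > "$workdir/filename.csv"

# Write everything except the first line to a temporary file,
# and replace the original only if awk finished successfully.
awk '{ if (NR != 1) { print } }' "$workdir/filename.csv" > "$workdir/filename.tmp" \
  && mv "$workdir/filename.tmp" "$workdir/filename.csv"

cat "$workdir/filename.csv"
```

For cron, the awk and mv commands could go into a small script that a crontab entry calls. Keeping the temporary file on the same filesystem as the original means mv is a plain rename rather than a copy.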
 
Old 08-02-2006, 09:03 PM   #7
sirclif
Member
 
Registered: Sep 2004
Location: south texas
Distribution: fedora core 3,4; gentoo
Posts: 192

Rep: Reputation: 30
Quote:
so then awk accesses filename.csv, reads all its contents to memory, ignores the first line and prints everything else to stdout?
i don't think this is what happens. the file is not read into memory. you can work on files that are larger than your available ram.

also, being able to replace the file by redirecting the stdout stream may depend on the shell, so i can't say this for all command lines. but if you try this in bash, you will get an empty file. when you redirect stdout to a file, like '$ command > file.txt', the first thing the shell does is create 'file.txt', or truncate it to zero length if it already exists, before the command even starts. so by the time awk tries to read it, it is an empty file.

if you type the command '$ ls > newfile.txt', you will see the file 'newfile.txt' listed in itself.
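
A small sketch that shows the truncation on a throwaway copy. The sample data and the temporary directory are made up for illustration, so don't point this at a file you care about.

```shell
# Create a scratch CSV with three lines.
demo=$(mktemp -d)
printf 'id,name\n1,alice\n2,bob\n' > "$demo/filename.csv"

# The shell truncates filename.csv before awk starts,
# so awk reads an empty file and writes nothing back.
awk '{ if (NR != 1) { print } }' "$demo/filename.csv" > "$demo/filename.csv"

# Count the bytes left in the file.
size=$(wc -c < "$demo/filename.csv" | tr -d ' ')
echo "bytes left: $size"
```

With the redirect set up before awk runs, the file ends up with zero bytes.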
 
Old 08-03-2006, 02:48 AM   #8
ckin2001
LQ Newbie
 
Registered: Jul 2006
Location: Chambana
Distribution: debian
Posts: 17

Rep: Reputation: 0
Awk reads input line by line rather than the whole file at once. The line

awk 'NR>1' filename.csv

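will do the same thing, printing the file contents after record one. A pattern with no action prints the matching record by default, so the explicit if and print are not needed. Awk still evaluates the comparison for every record, so it is probably not a big timesaver.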
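
A quick way to check that the two forms behave the same, sketched with made-up sample data:

```shell
# Sample input; the contents are made up for illustration.
sample=$(printf 'id,name\n1,alice\n2,bob\n')

# Long form from post #2 and short pattern-only form from this post.
long_form=$(printf '%s\n' "$sample" | awk '{ if (NR != 1) { print } }')
short_form=$(printf '%s\n' "$sample" | awk 'NR>1')

# Compare the two outputs.
if [ "$long_form" = "$short_form" ]; then
  echo "identical output"
fi
```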
 
  


All times are GMT -5.
