LinuxQuestions.org
Old 11-02-2011, 05:03 AM   #1
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Rep: Reputation: Disabled
awk: remove the last character in the file


Dear Experts,

I have a file with multiple lines that looks like this:
Code:
aaaaaaaaaaaaaaaa,
bbbbbbbbbbbbbbbb,
cccccccccccccccc,
dddddddddddddddd,
eeeeeeeeeeeeeeee,
I want to remove the last comma in the last line. The output file should look like:
Code:
aaaaaaaaaaaaaaaa,
bbbbbbbbbbbbbbbb,
cccccccccccccccc,
dddddddddddddddd,
eeeeeeeeeeeeeeee
I tried
Code:
awk '{gsub(/,$/,"");print}' FILENAME
and
Code:
sed 's#[\]$##' FILENAME
Both of these commands remove the trailing comma from every line in the file, which is not what I want. So, how can I remove only the last comma using awk?

Thanks a lot!
 
Old 11-02-2011, 05:24 AM   #2
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446Reputation: 446Reputation: 446Reputation: 446Reputation: 446
Hi,

If the line you want the comma removed from is really the last line of the file (i.e. no blank line follows it), you could try something like:
Code:
sed '$ s/,$//' file
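
The leading `$` in that sed command is an address that limits the substitution to the last line, which is why the other commas survive. Since the question asked for awk, here is a minimal sketch of the same idea in plain POSIX awk: each line is held back by one, so the final line can be edited in the END block. The function name strip_last_comma and the sample data are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: remove a trailing comma from the LAST line only.
# Uses only POSIX awk features; no GNU extensions are assumed.
strip_last_comma() {
    awk '
        NR > 1 { print prev }        # print the previous line unchanged
        { prev = $0 }                # hold the current line back
        END {
            if (NR > 0) {            # nothing to do for empty input
                sub(/,$/, "", prev)  # strip the comma on the final line
                print prev
            }
        }
    ' "$@"
}

# Sample run on made-up data:
printf 'aaaa,\nbbbb,\ncccc,\n' | strip_last_comma
```

Like the sed version, this only helps when the comma sits on the very last line; a trailing blank line would defeat it.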
 
1 members found this post helpful.
Old 11-02-2011, 08:32 AM   #3
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 942Reputation: 942Reputation: 942Reputation: 942Reputation: 942Reputation: 942Reputation: 942Reputation: 942
This GNU awk snippet keeps newlines intact, and removes the final comma even if there are empty lines following it:
Code:
gawk 'BEGIN { RS=",[\t\n\v\f\r ]*[\n\r]+" } { printf("%s%s", nl, $0) ; nl=RT } END { sub(/^\,/, "", nl); printf("%s", nl) }'
The idea is to use a record (line) separator consisting of a comma, optional whitespace, and one or more newlines. Using the automatic variable RT provided by GNU awk, we retain the record separators; each one is output only just before the next record. When all records have been output, the comma (if any) is stripped from the final record separator, and the final separator is output.

The end result is that the file stays exactly the same, except when there is a final comma followed by optional whitespace and at least one newline; then the comma is stripped away.
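
To see what RT holds, a small illustrative sketch (it requires GNU awk; the function name show_separators is made up) can print, for every record, its number and the length of the separator text that matched RS:

```shell
#!/bin/sh
# Illustration only: for each record, print its number and the length of
# the separator text that gawk stored in RT. Requires GNU awk (gawk).
show_separators() {
    gawk 'BEGIN { RS = ",[\t\n\v\f\r ]*[\n\r]+" }
          { printf("%d %d\n", NR, length(RT)) }' "$@"
}

# Example call (made-up input: two lines plus one trailing empty line):
#   printf 'a,\nb,\n\n' | show_separators
```

With the commented sample input, the second record's separator should come out longer than the first, because it also swallows the trailing empty line.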

Note that if there is no newline after the final comma, i.e. the comma is the last character in the file (except for optional spaces and tabs), it is not stripped. If you suspect you may have such files, it is better to use a slightly more complicated variant that handles that case too:
Code:
gawk 'BEGIN { RS=",[\t\n\v\f\r ]*[\n\r]+" } { printf("%s%s", ln, nl); ln = $0; nl = RT } END { if (length(nl) > 0) printf("%s%s", ln, gensub(/^,/, "", "g", nl)); else printf("%s", gensub(/,([\t\v\f ]*)$/, "\\1", "g", ln)) }'
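
Where gawk is not available, a similar result can probably be had with plain POSIX awk by buffering the lines and editing the last non-blank one. This is a different technique from the RT approach above, not a port of it: it assumes LF line endings, holds the whole file in memory, and always ends the output with a newline. The name strip_final_comma is hypothetical.

```shell
#!/bin/sh
# Hypothetical portable variant: buffers the whole input in memory, so it
# suits small files only. Assumes plain LF line endings.
strip_final_comma() {
    awk '
        { line[NR] = $0 }                     # keep every line
        $0 !~ /^[ \t]*$/ { last = NR }        # remember last non-blank line
        END {
            # Drop the comma but keep any spaces or tabs that follow it.
            if (last && match(line[last], /,[ \t]*$/))
                line[last] = substr(line[last], 1, RSTART - 1) \
                             substr(line[last], RSTART + 1)
            for (i = 1; i <= NR; i++) print line[i]
        }
    ' "$@"
}

# Sample run on made-up data with a trailing empty line:
printf 'aaaa,\nbbbb,\n\n' | strip_final_comma
```

Unlike the gawk versions, this one normalizes a missing final newline instead of preserving the file byte for byte.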
 
1 members found this post helpful.
Old 11-03-2011, 08:26 AM   #4
Reuti
Senior Member
 
Registered: Dec 2004
Location: Marburg, Germany
Distribution: openSUSE 11.4
Posts: 1,319

Rep: Reputation: 252Reputation: 252Reputation: 252
Will you feed this to another application that needs the final LF? If not, the head command might work too:
Code:
$ head -c -2 FILENAME
But it will remove the comma plus the final LF.
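
Since head -c -2 blindly drops the last two bytes, a cautious wrapper might first check that the file really ends in a comma followed by LF. A minimal sketch, assuming GNU head (the negative byte count is not POSIX); the function name drop_comma_and_lf and the temporary file are made up for the example:

```shell
#!/bin/sh
# Hypothetical wrapper around the head call: only truncate when the file
# really ends in ",\n" (bytes 2c 0a). A negative count for "head -c" is
# a GNU coreutils feature, not POSIX.
drop_comma_and_lf() {
    tail_hex=$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$tail_hex" = "2c0a" ]; then
        head -c -2 "$1"      # drop the final comma and LF
    else
        cat "$1"             # leave any other file untouched
    fi
}

# Sample run on a made-up temporary file:
tmp=$(mktemp)
printf 'aaaa,\nbbbb,\n' > "$tmp"
drop_comma_and_lf "$tmp"
rm -f "$tmp"
```

The output of the sample run has no trailing newline, which is exactly the caveat mentioned above.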
 
1 members found this post helpful.
Old 11-03-2011, 09:47 AM   #5
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
This is really smart!! Thanks!!!
 
Old 11-03-2011, 10:19 AM   #6
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
I got the point, thanks for the detailed explanation!!! I did not even know about the variable RT before. It seems to be a really powerful application.
 
  


All times are GMT -5.