LinuxQuestions.org
01-05-2008, 09:38 AM   #1
powah
remove the extra numeric field in a text file


How can I remove the extra numeric field in a text file?
For example, in the following text file, I want to remove $2,299.11 from the
first line, $2,292.86 from the second line, $2,170.08 from the fourth
line and $2,286.08 from the last line.

Oct. 01, 2007 CITY OF VANCOUVER $456.48 $2,299.11 Property taxes
Oct. 02, 2007 TD GEN INS $66.25 $2,292.86 Car insurance
Oct. 09, 2007 BELAIR HABITAT. BELAIR INS/ASS. $40.95 Home insurance
Oct. 09, 2007 ENBRIDGE ENBRIDGE $45.91 $2,170.08 Home heating
Nov. 05, 2007 CANADIAN TIRE # $22.74 Home general maintenance
Dec. 10, 2007 CANADIAN TIRE # $44.21 Home general maintenance
Dec. 19, 2007 CANADIAN TIRE # $71.48 $2,286.08 Home general maintenance
 
01-05-2008, 10:39 AM   #2
Telemachos
Here is a small Perl script that seems to work (at least on the data you gave). At the moment, it just prints the changed version to your screen, but it should be easy enough to redirect that to a new file (or edit the script so that it saves the changed version). I just wanted you to be able to test it safely before saving anything or changing your data. I hope that this helps:
Code:
#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    s/(\$\d+,?\d+\.\d{2})\s+(\$\d+,?\d+\.\d{2})/$1/;
    print;
}
Save this as "delete_numbers" and then run it with, say,
Code:
perl delete_numbers file_to_change
That way you can check if it works across a whole real file before deciding what to do. Note that this will only work if the two numbers are next to one another, separated only by whitespace. If the file gets more complicated (e.g., with words between the two numbers, or the numbers on separate lines), this won't work.
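One cautious way to try this, sketched as a small shell script: write the result to a new file, compare it with the original, and only then decide whether to replace anything. The function names `strip_second` and `preview_change` and the temporary file are made up for illustration; the one-liner is assumed to behave like the script above, using Perl's `-p` switch to supply the read/print loop.

```shell
#!/bin/sh
# Sketch only: the same substitution as the script above, as a one-liner.
# -p wraps the code in a loop equivalent to "while (<>) { ...; print }".
strip_second() {
    perl -pe 's/(\$\d+,?\d+\.\d{2})\s+(\$\d+,?\d+\.\d{2})/$1/' "$1"
}

# Hypothetical helper: write the changed copy next to the original
# and show the differences. The original file is never modified.
preview_change() {
    strip_second "$1" > "$1.new"
    diff "$1" "$1.new"    # diff exits non-zero when the files differ
}

# Demo on a throwaway file with two lines taken from the sample data.
tmp=$(mktemp)
printf '%s\n' \
    'Oct. 02, 2007 TD GEN INS $66.25 $2,292.86 Car insurance' \
    'Oct. 09, 2007 BELAIR INS/ASS. $40.95 Home insurance' > "$tmp"
preview_change "$tmp"
rm -f "$tmp" "$tmp.new"
```

Only the first demo line should show up in the diff, since the second has a single amount. Once the diff looks right on a real file, the `.new` copy can be moved over the original by hand.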

Last edited by Telemachos; 01-05-2008 at 10:48 AM.
 
01-05-2008, 12:37 PM   #3
pixellany
Life would not be complete without a solution using SED....

Code:
sed 's/\$[0-9]\+,\?[0-9]\+\.[0-9]\{2\}//2' filename > newfilename
Disclosure: I looked at the Perl solution, and also had to learn that the "+" has to be escaped in sed (in its default basic-regex mode).

EDIT: fixed omitted "\"

Last edited by pixellany; 01-06-2008 at 01:12 PM. Reason: boo-boo
 
01-06-2008, 02:54 AM   #4
ghostdog74
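In bash, tested only on your sample file.
Code:
# Drop every whitespace-separated word that starts with "$" and contains a comma.
while read -r line
do
    for items in $line
    do
        case $items in
        "$"*","* ) line="${line/$items/}";;
        esac
    done
    # Left unquoted so that word splitting collapses the leftover double space.
    echo $line
done < "file"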
 
01-06-2008, 10:32 AM   #5
yawe_frek
Hi pixellany,

For people like me who are still learning sed, could you kindly explain

sed 's/$[0-9]\+,\?[0-9]\+\.[0-9]\{2\}//2' filename > newfilename

Thanks
 
01-06-2008, 11:42 AM   #6
pixellany
Quote:
Originally Posted by yawe_frek
Hi pixellany,

For people like me who are still learning sed, could you kindly explain

sed 's/$[0-9]\+,\?[0-9]\+\.[0-9]\{2\}//2' filename > newfilename

Thanks
The basic syntax for sed s --in this context--is:
sed 's/thingtofind//2' filename > newfilename
This means "using filename, find thingtofind and replace the 2nd occurrence on each line with nothing. Write the result to newfilename."
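To see what the trailing 2 does on its own, here is a tiny sketch with made-up input, where a plain "-" stands in for thingtofind:

```shell
# The number after the last slash picks which match on each line is replaced.
printf '%s\n' 'a-b-c-d' 'x-y' | sed 's/-//2'
# First line:  only the second "-" is removed, giving a-bc-d
# Second line: there is no second "-", so x-y is printed unchanged
```

The count restarts on every input line, which is why the command in this thread removes at most one amount per line.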


Now to translate "thingtofind":

EEK!!!! I somehow posted the wrong code. It should be:

Quote:
sed 's/\$[0-9]\+,\?[0-9]\+\.[0-9]\{2\}//2' filename > newfilename
That first "\" makes all the difference.....

Translation (description first, then the matching piece of the pattern):
literal "$": \$
one or more digits: [0-9]\+
an optional comma: ,\? (This means there can be a comma, but not any other character.)
one or more digits: [0-9]\+
literal ".": \.
exactly two digits: [0-9]\{2\}
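The pieces above can be tried out on a few lines with a sketch like the following. It assumes a sed that accepts `-E` (extended regular expressions, where + ? and {} need no backslash; GNU and BSD sed both do). The variable `amount`, the function `strip_second_amount`, and the third input line are made up for illustration.

```shell
# The same pattern as above, written for "sed -E".
amount='\$[0-9]+,?[0-9]+\.[0-9]{2}'

# Remove the second amount on each line of standard input.
strip_second_amount() {
    sed -E "s/$amount//2"
}

printf '%s\n' \
    'TD GEN INS $66.25 $2,292.86 Car insurance' \
    'BELAIR $40.95 Home insurance' \
    'EXAMPLE $5.00 $1,234.56 note' | strip_second_amount
# Line 1: $2,292.86 is removed (a double space is left where it was)
# Line 2: only one amount, so nothing changes
# Line 3: unchanged, because $5.00 does not match the pattern at all
```

The third line shows a limit of the pattern as written: the two "one or more digits" pieces together require at least two digits before the decimal point, so $5.00 is not counted as an amount and $1,234.56 becomes the first match rather than the second.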

One of the big tricks with something like this is to keep the regular expression from being greedy---i.e., matching more than was intended.

My favorite SED tutorial here: http://www.grymoire.com/Unix
 
01-06-2008, 12:27 PM   #7
yawe_frek
Thanks so much for taking the time to explain this to me; I am really glad. Lest I forget, kindly send me the special characters that need to be escaped.

These are the ones I know:

.*^$[]\

THANKS

Last edited by yawe_frek; 01-06-2008 at 12:34 PM.
 
01-06-2008, 01:09 PM   #8
pixellany
The tutorial I linked has all that stuff--and more.

The really complete reference is the Advanced Bash Scripting Guide (ABS)---at http://tldp.org

Here's one basic definition of when an escape is needed:
"Whenever the meaning of the character needs to be changed from what it normally would be in the context." Escaping can be used to make a character be special---or to stop it from being special.

Examples:
sed 's/?/C/g' filename changes all "?" to "C"--- "?" is not special in SED, unless it is escaped.
sed 's/./C/g' filename changes any character to "C"---thus for a literal ".", we need "\."
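A quick way to check these rules for yourself, sketched with a made-up input string instead of a file:

```shell
# "?" is an ordinary character in sed's default (basic) regex syntax.
echo 'a?b.c' | sed 's/?/C/g'      # prints aCb.c

# An unescaped "." matches any single character.
echo 'a?b.c' | sed 's/./C/g'      # prints CCCCC

# Escaping the dot makes it match only a literal ".".
echo 'a?b.c' | sed 's/\./C/g'     # prints a?bCc
```

Piping a short test string through sed like this is a low-risk way to do the trial-and-error part before touching a real file.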

Disclosure: The only way I learned this stuff was a mix of reading and trial and error. The power of BASH and Regular expressions unfortunately comes with a lot of stuff that is not intuitive.
 
01-06-2008, 01:46 PM   #9
yawe_frek
Thanks once again for the sites.
 
01-06-2008, 07:53 PM   #10
ghostdog74
@OP
Just FYI, regexp can be a tool for you to learn and use, but not every solution to a problem needs a regexp. Here's a link for you to get started on sed/awk and shell.
 
01-07-2008, 08:29 PM   #11
Telemachos
Quote:
Originally Posted by ghostdog74
@OP
Just FYI, regexp can be a tool for you to learn and use, but not every solution to a problem needs a regexp. Here's a link for you to get started on sed/awk and shell.
I was curious about the link, but it seems dead. Let us know if it's just a typo, please.
 
01-07-2008, 08:49 PM   #12
ghostdog74
Quote:
Originally Posted by Telemachos
I was curious about the link, but it seems dead. Let us know if it's just a typo, please.
It's searchable from Google if you know how to. Then how about this and this.
 
01-08-2008, 10:15 AM   #13
Telemachos
Quote:
Originally Posted by ghostdog74
It's searchable from Google if you know how to. Then how about this and this.
Google does searches? Wow, I hadn't known that. Thanks for the tip. You should have checked your link before you posted it. I was simply pointing out the link to nowhere, which you should still edit out of your original post to save other people the trouble of clicking on a dead link. The first link in the post above is to a file that seems corrupt, or at least it fills my browser with nonsense characters. You really should check before you post a link.
 
01-08-2008, 08:24 PM   #14
ghostdog74
Quote:
Originally Posted by Telemachos
You should have checked your link before you posted it.
The link is OK from my side. From what I know, some areas cannot get through to it.
Quote:
at least it fills my browser with nonsense characters. You really should check before you post a link.
It depends on whether you know how it's done. You can just right-click and save it to your system.
 
  

