LinuxQuestions.org
Old 09-29-2007, 07:08 PM   #1
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Rep: Reputation: 30
sed add , before .[0-9]


I would like to use sed to replace 1979 with ,1979

I was thinking something along the lines of:

sed 's/*.^[0-9]/,/' mycoolfile


but that's not quite right. How do I specify replacing the space before something?

Last edited by donnied; 09-29-2007 at 07:09 PM.
 
Old 09-29-2007, 09:05 PM   #2
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
Code:
sed 's/1979/,1979/' yourcoolfile
This is for your first line. I don't understand the last question. If you want to replace a space before any 4 digit number,
Code:
$ cat sample
abcd 1234 2007 3456 11111
abcde 111 111 22 33 1267

jschiwal@hpmedia ~
$ sed 's/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g' sample
abcd,1234 2007,3456 11111
abcde 111 111 22 33 1267
Guess what, it's wrong. This rule doesn't work if the 4-digit number is at the end of a line. My bad, but it illustrates the importance of testing sed scripts before using them.
Code:
$ cat sample
abcd 1234 2007 3456 11111
abcde 111 111 22 33 1267

jschiwal@hpmedia ~
$ sed 's/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
abcd,1234 2007,3456 11111
abcde 111 111 22 33,1267
Here I added a second rule. You can have more than one rule on the same line by separating them with a semicolon. For more complicated sed instructions, create a separate file.

You didn't make clear whether the number needs to be exactly 4 digits. You only gave one example, which resembled a year. When using sed, awk, or any regular expression, it is important to be as precise as you need to be. Otherwise you will either miss some replacements, like my first attempt, or get a false positive match, which could cause a replacement you don't want.
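
A minimal sketch of the "separate file" suggestion, assuming the two rules are saved in a script file and passed to sed with -f. The file names (commas.sed, sample) and the temporary directory are placeholders for illustration only, not something from the thread.

```shell
# Work in a throwaway directory; all file names here are illustrative.
dir=$(mktemp -d)
printf 'abcd 1234 2007 3456 11111\nabcde 111 111 22 33 1267\n' > "$dir/sample"

# One sed command per line; no shell quoting or semicolons are needed
# inside a script file.
cat > "$dir/commas.sed" <<'EOF'
s/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g
s/ \([[:digit:]]\{4\}\)$/,\1/
EOF

result=$(sed -f "$dir/commas.sed" "$dir/sample")
printf '%s\n' "$result"
rm -r "$dir"
```

This should print the same two lines as the one-line version, and the script file is easier to extend and test rule by rule.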

Last edited by jschiwal; 09-29-2007 at 09:24 PM.
 
Old 09-30-2007, 02:46 AM   #3
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Quote:
Originally Posted by jschiwal
Code:
$ cat sample
abcd 1234 2007 3456 11111
abcde 111 111 22 33 1267

jschiwal@hpmedia ~
$ sed 's/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
abcd,1234 2007,3456 11111
abcde 111 111 22 33,1267
Shouldn't 2007 have a ',' before it? Running the sed expression twice should fix this issue.
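
One way to see why 2007 is skipped is to mark what each match actually consumes. The sketch below is only a diagnostic, not one of the commands posted in the thread: it reuses the pattern of the first rule and wraps every match in brackets with `&`.

```shell
line='abcd 1234 2007 3456 11111'

# Wrap each match in [ ] so the consumed text is visible.
marked=$(printf '%s\n' "$line" | sed 's/ [0-9]\{4\}[^0-9]/[&]/g')
printf '%s\n' "$marked"
# prints: abcd[ 1234 ]2007[ 3456 ]11111
```

The space after 1234 is already inside the first match, so it is no longer available to start a match for 2007. A second pass sees that space again, which is why running the expression twice helps.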

Last edited by angrybanana; 09-30-2007 at 03:06 AM.
 
Old 09-30-2007, 10:20 AM   #4
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by jschiwal
Code:
sed 's/1979/,1979/' yourcoolfile
If you want to replace a space before any 4 digit number,
Thank you. I thought of that later. However, you're right. My true intent is to place a comma before a four digit string of numbers.


Quote:
Originally Posted by jschiwal
Code:
$ sed 's/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
I hadn't seen posix character classes until last night. It looks like the way to go. I was able to identify four digit years with [0000-9999]; however, the replacement part didn't work out well. I can specify

Code:
sed 's/ .^[0000-9999]/ ,
but how do I keep the same numbers when I replace?
Quote:
Originally Posted by jschiwal
You didn't make clear whether the number needs to be exactly 4 digits. You only gave one example, which resembled a year. When using sed, awk, or any regular expression, it is important to be as precise as you need to be. Otherwise you will either miss some replacements, like my first attempt, or get a false positive match, which could cause a replacement you don't want.
Thank you again. This was helpful. I'll try to deconstruct it.

On a side note:
What if I wanted the comma before any size string of numbers? Is there a way without :digit: or setting variables?
how do I specify a line break? (I want to replace a line break and three tabs with the last entry on the line that did not consist of tabs.)

Some Guy- wrote this book 2007
{tabx3} wrote this book 2006
{tabx3} wrote another book 2005

then becomes

Some Guy, book a, 2007
Some Guy, book b, 2006
Some Guy, book c, 2005

Thanks again. I've been reading through O'Reilly "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
 
Old 09-30-2007, 07:32 PM   #5
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Quote:
Originally Posted by donnied
Code:
sed 's/ .^[0000-9999]/ ,
but how do I keep the same numbers when I replace?
You keep the number by using groups. Enclose whatever you want to keep in '\(' and '\)' (sed's default basic regex syntax needs the backslashes), then call it back using '\1'. Multiple groups get multiple numbers: \1, \2, \3, etc. For example:
Code:
$ echo "foobar"|sed 's/.*\(oo\).*/m\1/'
moo
the 'oo' = group #1, however the whole match is replaced with 'm'+group1. hence 'moo'
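
A small sketch of the "multiple groups" case, using a made-up input string: two groups are captured and written back in the opposite order.

```shell
# Two groups: \1 is the leading number, \2 is the rest of the line.
swapped=$(echo "2007 book" | sed 's/\([0-9]*\) \(.*\)/\2, \1/')
echo "$swapped"
# prints: book, 2007
```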

Here's a more relevant example.
Code:
echo "book 1979"|sed 's/ \([0-9]\{4\}\)/, \1/'
book, 1979
This matches *space* then 0-9 (4 times). Only the 4 numbers are put into group one. The whole match (space + number) is replaced with ', '+group 1
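
On the earlier [0000-9999] attempt: a bracket expression always matches exactly one character, so that pattern is just a longer spelling of [0-9]. The "four times" part has to come from \{4\}. A quick check with grep -c (which counts matching lines), offered as a sketch:

```shell
# grep -c prints the number of matching lines; "|| true" only keeps a
# zero count from being treated as a failure under "set -e".
single=$(printf '7\n'    | grep -c '^[0000-9999]$' || true)   # one digit matches
year=$(printf '1979\n'   | grep -c '^[0000-9999]$' || true)   # four digits do not
repeat=$(printf '1979\n' | grep -c '^[0-9]\{4\}$'  || true)   # the interval does
echo "$single $year $repeat"
# prints: 1 0 1
```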

Hope that makes sense..

Quote:
Originally Posted by donnied
What if I wanted the comma before any size string of numbers? Is there a way without :digit: or setting variables?
how do I specify a line break? (I want to replace a line break and three tabs with the last entry on the line that did not consist of tabs.)
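1. [[:digit:]] == [0-9]. Not using either of those two will just be difficult. [0-9]+ matches 1 or more repetitions of [0-9], but only in extended regex syntax (sed -E, or -r on older GNU sed); in sed's default basic syntax, write [0-9][0-9]* or, with GNU sed, [0-9]\+.
2. ^ matches the start of a line; $ matches the end of a line. '^some guy' matches lines that start with 'some guy'.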
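
A sketch of the "comma before a number of any length" case, using jschiwal's first sample line. It assumes a sed that accepts -E for extended syntax (current GNU and BSD sed do); the first form sticks to basic syntax and should work anywhere.

```shell
line='abcd 1234 2007 3456 11111'

# Portable basic-regex spelling of "one or more digits": [0-9][0-9]*
bre=$(printf '%s\n' "$line" | sed 's/ \([0-9][0-9]*\)/,\1/g')

# Same rule in extended syntax; -E is accepted by GNU and BSD sed
# (older GNU sed releases spell it -r).
ere=$(printf '%s\n' "$line" | sed -E 's/ ([0-9]+)/,\1/g')

printf '%s\n%s\n' "$bre" "$ere"
# both lines print: abcd,1234,2007,3456,11111
```

Because nothing after the digits is consumed, adjacent numbers are all handled in one pass. The rule is deliberately loose: a word such as "12ab" after a space would also get a comma.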

I'm not too good with awk/sed, but I'll try to give you an awk solution for your example in a bit...if i figure it out

Quote:
Originally Posted by donnied
Thanks again. I've been reading through O'Reilly "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
sed = http://sed.sf.net/sedfaq.html | http://xrl.us/sedintro#uh-0 | http://xrl.us/sedstd | http://www.gnu.org/software/sed/manual/
awk = http://www.gnu.org/software/gawk/manual/ | http://catonmat.net/download/awk.cheat.sheet.txt

Those are the topics of the #awk and #sed channels at irc.freenode.net, which are another great source of info if you're ever stuck trying to figure something out.

Edit:
woohoo! I did it.
Code:
$ cat sample
Some Guy- wrote this book 2007
                        wrote this book 2006
                        wrote another book 2005
other guy- wrote this book 2008
                        book b 2009

$ awk -F'- ' 'BEGIN {OFS="- "}
{if (!/^\t\t\t/) name=$1;
else {sub("^\t\t\t", "", $0);$2=$0;$1=name}}
{gsub(" [0-9][0-9][0-9][0-9]", ",&",$2);print $0}' "sample"

Some Guy- wrote this book, 2007
Some Guy- wrote this book, 2006
Some Guy- wrote another book, 2005
other guy- wrote this book, 2008
other guy- book b, 2009
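
The script above leans on an awk behavior that is easy to miss: assigning to any field makes awk rebuild $0 by joining the fields with OFS. A stripped-down sketch with made-up input:

```shell
# Assigning to $1 forces awk to rebuild $0 using OFS ("- ").
rebuilt=$(echo 'a- b' | awk -F'- ' 'BEGIN {OFS="- "} {$1="x"; print $0}')
echo "$rebuilt"
# prints: x- b
```

That is why setting $2 and then $1 on the tab-indented lines is enough to produce a complete "name- title" record.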

Last edited by angrybanana; 09-30-2007 at 08:33 PM.
 
Old 09-30-2007, 08:27 PM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by donnied
My true intent is to place a comma before a four digit string of numbers.
assuming this is really what you want. data tested is from jschiwal's post.
Code:
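awk '{
        # Walk every whitespace-separated field on the line.
        for(i=1;i<=NF;i++){
         # A field with a non-zero numeric value and a length of 4 gets a leading comma.
         if( $i+0 && length($i) == 4  ) {
           $i = ","$i
         }
         printf "%s " ,$i
        }
        printf "\n"
}' sample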
output:
Code:
# ./test.sh
abcd ,1234 ,2007 ,3456 11111
abcde 111 111 22 33 ,1267
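
One possible variation, offered as a sketch rather than a correction of the post: the `$i+0` test looks at the numeric value of a field, so "0000" is skipped (its value is 0) and "12ab" is accepted (its value is 12 and its length is 4). Matching the field text against a regex avoids both. The second input line below is made up to exercise those cases.

```shell
strict=$(printf 'abcd 1234 2007 3456 11111\nabcde 0000 12ab 33 1267\n' |
awk '{
    # Test the text of each field, not its numeric value.
    for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9][0-9][0-9][0-9]$/) $i = "," $i
    print
}')
printf '%s\n' "$strict"
# prints:
# abcd ,1234 ,2007 ,3456 11111
# abcde ,0000 12ab 33 ,1267
```

Using `print` instead of `printf "%s "` also avoids the trailing space at the end of each line.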
Quote:
I've been reading through O'Reilly "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
are you looking for the answers in that book? or are you learning how to get to the answers?
 
Old 10-01-2007, 06:11 AM   #7
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by angrybanana
You keep the number by using groups. Enclose whatever you want to keep in '\(' and '\)' (sed's default basic regex syntax needs the backslashes), then call it back using '\1'. Multiple groups get multiple numbers: \1, \2, \3, etc.

I'm not too good with awk/sed, but I'll try to give you an awk solution for your example in a bit...if i figure it out

sed = http://sed.sf.net/sedfaq.html | http://xrl.us/sedintro#uh-0 | http://xrl.us/sedstd | http://www.gnu.org/software/sed/manual/
awk = http://www.gnu.org/software/gawk/manual/ | http://catonmat.net/download/awk.cheat.sheet.txt
Wow! Thank you for the information. You explained really well. That was an amazing amount of work you did. I appreciate it and it has helped my understanding.
 
Old 10-02-2007, 02:26 AM   #8
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I'll have to look at the example I posted to see why the 2007 isn't handled. I think that the "[^[:digit:]]" gobbles up the space before the next number. Adding the first command again solves the problem.
Code:
sed 's/ \([0-9]\{4\}[^0-9]\)/,\1/g;s/ \([0-9]\{4\}[^0-9]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
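
An alternative to repeating the rule by hand, sketched here as one option and not something posted in the thread: sed's `t` command branches to a label when a substitution has succeeded, so the rule can be rerun until it stops matching. The label name `a` is arbitrary, and the trailing non-digit is kept in its own group so it is written back unchanged.

```shell
line='abcd 1234 2007 3456 11111'

# :a / ta form a loop: repeat the substitution until it no longer matches,
# then apply the end-of-line rule once.
looped=$(printf '%s\n' "$line" | sed -e ':a' \
    -e 's/ \([0-9]\{4\}\)\([^0-9]\)/,\1\2/' \
    -e 'ta' \
    -e 's/ \([0-9]\{4\}\)$/,\1/')
printf '%s\n' "$looped"
# prints: abcd,1234,2007,3456 11111
```

The loop ends because every successful substitution removes one of the spaces that the pattern needs in order to match.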
 
  


All times are GMT -5.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration