LinuxQuestions.org
Old 02-05-2016, 09:08 PM   #1
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Rep: Reputation: Disabled
sed - stream editor and regex [0-9]*


I do not understand why this sed operation did not repeat "123":

$ echo "abc 123" | sed 's/[0-9]*/& &/'
 abc 123

I was expecting output of:
abc 123 123

since 123 was matched,
and I have replacement for 123 as /& &/ = 123 123.

Or this
abc 123 abc 123

since "a" in abc matches [0-9]*.


Obviously I misunderstood how the regex matches numbers using the pattern [0-9]*.
My understanding is that [0-9]* will match any set of digits starting from zero or more single digit like: 3, 00, 10, 234, a, ab, ...

What did I miss?
Thanks.

Last edited by fanoflq; 02-05-2016 at 10:08 PM.
 
Old 02-05-2016, 09:40 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 20,823

Rep: Reputation: 4004
Quote:
Originally Posted by fanoflq View Post
My understanding is that [0-9]* will match any set of digits starting from zero or more single digit
When testing a regex, use the -n switch on sed and the "p" flag on the substitution, so that only the records where the substitution succeeded are printed. Then you know when it isn't doing what you want. (You're just getting the echo'd data here.)

Also when using regex, use the -r switch.
That should get you started.
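A minimal sketch of the -n / p suggestion, assuming GNU sed (the angle-bracket markers and the variable names are only for illustration). It also suggests that -r changes the regex syntax, not what [0-9]* is allowed to match:

```shell
# With -n, sed prints nothing on its own; the p flag prints the line
# only when the substitution actually matched something.
matched=$(echo "abc 123" | sed -n 's/[0-9][0-9]*/<&>/p')
unmatched=$(echo "abc def" | sed -n 's/[0-9][0-9]*/<&>/p')
echo "matched:   '$matched'"      # matched:   'abc <123>'
echo "unmatched: '$unmatched'"    # unmatched: ''

# -r switches to extended syntax, but [0-9]* still matches the empty
# string at the start of the line.
extended=$(echo "abc 123" | sed -r 's/[0-9]*/<&>/')
echo "$extended"                  # <>abc 123
```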
 
1 members found this post helpful.
Old 02-05-2016, 09:40 PM   #3
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
Not quite. [0-9]* matches zero or more digits, and the first place it can match is the empty string at the front of the line.
Code:
~$ echo "abc 123" | sed 's/[0-9]*/fjfg/'
fjfgabc 123
Try:
Code:
$ echo "abc 123" | sed 's/[0-9][0-9][0-9]/& &/'
abc 123 123
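To make the empty match visible, one option is to wrap & in brackets. This is only a sketch (the brackets and variable names are illustrative), and it also shows that the original command does change the line: it inserts a single leading space, which is easy to miss.

```shell
# Brackets around & show exactly what [0-9]* matched: nothing, at position 0.
shown=$(echo "abc 123" | sed 's/[0-9]*/[&]/')
echo "$shown"                     # []abc 123

# The original command therefore inserts "empty, space, empty" at the front.
orig=$(echo "abc 123" | sed 's/[0-9]*/& &/')
echo "'$orig'"                    # ' abc 123'
```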
 
3 members found this post helpful.
Old 02-05-2016, 10:04 PM   #4
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 732

Rep: Reputation: 75
Hi.
Code:
echo "abc 123" | sed 's/[0-9]\{1,\}/& &/'
abc 123 123
For systems like:
Code:
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.3 (jessie) 
sed (GNU sed) 4.2.2
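As a sketch of why the interval form works: \{1,\} demands at least one digit, so the empty match at the front of the line is no longer possible. Writing one digit followed by [0-9]* should behave the same way in basic regex syntax (variable names here are illustrative):

```shell
# Interval expression from the post above: one or more digits.
interval=$(echo "abc 123" | sed 's/[0-9]\{1,\}/& &/')
# Same idea spelled out: one digit, then zero or more digits.
spelled=$(echo "abc 123" | sed 's/[0-9][0-9]*/& &/')
echo "$interval"                  # abc 123 123
echo "$spelled"                   # abc 123 123
```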
Best wishes ... cheers, makyo

Last edited by makyo; 02-05-2016 at 10:07 PM.
 
1 members found this post helpful.
Old 02-05-2016, 10:18 PM   #5
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Original Poster
Rep: Reputation: Disabled
Why do you want to use -n?
From man page:
-n, --quiet, --silent
suppress automatic printing of pattern space


$ echo "abc 123" | sed -n 's/[0-9]*/& &/p'
 abc 123 <=== Is this the first pattern match?
$ echo "abc 123" | sed 's/[0-9]*/& &/p'
 abc 123
 abc 123

Why two different results?
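The two results differ because, without -n, sed prints the pattern space automatically at the end of each cycle, and the p flag prints it one more time when the substitution succeeds. With -n only the p copy remains. A small sketch that counts the output lines (variable names are illustrative):

```shell
# -n suppresses the automatic print, so only the copy from p is left.
with_n=$(echo "abc 123" | sed -n 's/[0-9]*/& &/p' | wc -l | tr -d ' ')
# Without -n there are two copies: one from p, one from the automatic print.
without_n=$(echo "abc 123" | sed 's/[0-9]*/& &/p' | wc -l | tr -d ' ')
echo "with -n: $with_n line(s), without -n: $without_n line(s)"
```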
 
Old 02-05-2016, 10:36 PM   #6
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Original Poster
Rep: Reputation: Disabled
The pattern [0-9]* will be a valid match for anything.
So the regex should return these:
a
ab
abc
abc ( there is a space after abc)
abc 1
abc 12
abc 123

b
bc
....

c
c + space
c 1
....
So I really should expect an infinite match.
This is where I am not sure about the correctness of sed's regex.
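sed does not enumerate every possible substring. The match starts at the leftmost position where the pattern can match, is as long as possible from there, and without the g flag only that one match is replaced. A sketch with illustrative inputs and variable names:

```shell
# Leftmost match only: the second run of digits is left alone.
first=$(echo "12 34" | sed 's/[0-9][0-9]*/[&]/')
echo "$first"                     # [12] 34

# The g flag asks for every non-overlapping match instead of just the first.
all=$(echo "12 34" | sed 's/[0-9][0-9]*/[&]/g')
echo "$all"                       # [12] [34]

# Longest match from that position: "abc", not "a" or "ab".
longest=$(echo "abc 123" | sed 's/[a-z]*/[&]/')
echo "$longest"                   # [abc] 123
```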
 
Old 02-05-2016, 10:41 PM   #7
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,077
Blog Entries: 23

Rep: Reputation: 4063
The problem as noted already is that the '*' matches zero or more digits at the beginning, so the output is 'nothing space nothing' (the zero digits that were matched, twice, separated by a space) followed by the rest of the string.

Use '+' instead...

Code:
echo "abc 123" |sed 's/[0-9]\+/& &/'
abc 123 123
Note the escaped '\+' which is not necessary with the -r option ('+' with -r).
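A side-by-side sketch of the two spellings, assuming GNU sed: \+ in basic syntax is a GNU extension, and -r (also spelled -E in newer versions) lets + be written unescaped. Variable names are illustrative.

```shell
# Basic syntax with the GNU \+ extension.
basic=$(echo "abc 123" | sed 's/[0-9]\+/& &/')
# Extended syntax: + needs no backslash.
extended=$(echo "abc 123" | sed -r 's/[0-9]+/& &/')
echo "$basic"                     # abc 123 123
echo "$extended"                  # abc 123 123
```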

Last edited by astrogeek; 02-05-2016 at 10:58 PM. Reason: typos and changed wording for clarity
 
3 members found this post helpful.
Old 03-03-2017, 04:59 PM   #8
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Original Poster
Rep: Reputation: Disabled
I went back and looked up this post while reading about sed again.

Quote:
Originally Posted by astrogeek View Post
The problem as noted already is that the '*' matches zero or more digits at the beginning, so the output is 'nothing space nothing' (the zero digits that were matched, twice, separated by a space) followed by the rest of the string.

Use '+' instead...

Code:
echo "abc 123" |sed 's/[0-9]\+/& &/'
abc 123 123
Note the escaped '\+' which is not necessary with the -r option ('+' with -r).


I finally got it.

The digits were matched and from man sed:
Code:
s/regexp/replacement/
Attempt to match regexp against the pattern space.
If successful, replace that portion matched with replacement.

The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
This is how I finally understood it.
Since & is 123 when a match is found, the effective replacement of the matched text (123) in this case is /& &/ = /123 123/.
So "123 abc" becomes "123 123 abc".
Thanks.
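The "123 abc" case and the \1 through \9 escapes from the man page excerpt can be checked with a short sketch (the group layout and variable names are only an illustration):

```shell
# Digits at the start of the line: [0-9]* now matches "123" at position 0.
front=$(echo "123 abc" | sed 's/[0-9]*/& &/')
echo "$front"                     # 123 123 abc

# \1 and \2 refer to the first and second \( \) groups in the regexp.
swapped=$(echo "abc 123" | sed 's/\([a-z]*\) \([0-9]*\)/\2 \1/')
echo "$swapped"                   # 123 abc
```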

Last edited by fanoflq; 03-03-2017 at 05:02 PM.
 
  


All times are GMT -5.
