LinuxQuestions.org
Old 08-29-2011, 03:02 AM   #1
915086731
Member
 
Registered: Apr 2010
Posts: 144
Blog Entries: 6

Rep: Reputation: 2
What's the difference between \d, [:digit:], and [0-9] in regular expressions?


Hello,
Code:
[river@localhost ate]$ [[ "123" =~ \d ]] && echo "ok" || echo "error";
error
[river@localhost ate]$ [[ "123" =~ [:digit:] ]] && echo "ok" || echo "error";
error
[river@localhost ate]$ [[ "123" =~ [0-9] ]] && echo "ok" || echo "error";
ok
[river@localhost ate]$
It seems that \d, [:digit:], and [0-9] are not the same. According to the regular expression reference, all three have the same meaning: each represents a single digit. So why don't the first two work on Linux?

Code:
[river@localhost ate]$ [[ "123" =~ \b[0-9]{3}\b ]] && echo "ok" || echo "error";
error
I am very puzzled by the above: "123" should match \b[0-9]{3}\b, so why doesn't it?
Thanks!

Last edited by 915086731; 08-29-2011 at 03:08 AM.
 
Old 08-29-2011, 04:36 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Where did you read that all of these should work in bash? Also, I think you should look up how character classes, such as your [:digit:] example, are meant to be used.
 
Old 08-29-2011, 08:18 PM   #3
915086731
Member
 
Registered: Apr 2010
Posts: 144

Original Poster
Blog Entries: 6

Rep: Reputation: 2
Thanks. [0-9] works, but why does "\b[0-9]{3}\b" not work?
Code:
[river@localhost ate]$ [[ "123" =~ \b[0-9]{3}\b ]] && echo "ok" || echo "error";
error
 
Old 08-29-2011, 08:40 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
I find the best way to use regexes in bash is to assign them to a variable first. I believe this saves you from worrying about escape sequences. Hence:
Code:
reg='\b[0-9]{3}\b'

[[ "123" =~ $reg ]] && echo "ok" || echo "error"
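
A minimal sketch of the same variable approach wrapped in a helper function. The name `matches_regex` is hypothetical, not a bash builtin. The pattern here deliberately avoids \b: bash hands the regex to the C library, and \b is a GNU extension rather than part of POSIX extended regular expressions, so whether it works depends on the system.

```bash
#!/usr/bin/env bash

# matches_regex STRING PATTERN
# Hypothetical helper: prints "ok" if STRING matches the extended
# regular expression PATTERN, and "error" otherwise.
matches_regex() {
    local string=$1
    local pattern=$2
    # $pattern is expanded unquoted on purpose, so bash treats it
    # as a regex rather than as a literal string.
    if [[ $string =~ $pattern ]]; then
        echo "ok"
    else
        echo "error"
    fi
}

reg='^[0-9]{3}$'               # exactly three digits and nothing else
matches_regex "123" "$reg"     # prints "ok"
matches_regex "1234" "$reg"    # prints "error"
```

Quoting "$reg" when passing it to the function is fine; what matters is that the expansion on the right-hand side of =~ stays unquoted.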
 
2 members found this post helpful.
Old 08-29-2011, 08:46 PM   #5
kurumi
Member
 
Registered: Apr 2010
Posts: 228

Rep: Reputation: 53
Quote:
Originally Posted by 915086731
Thanks. [0-9] works, but why does "\b[0-9]{3}\b" not work?
Code:
[river@localhost ate]$ [[ "123" =~ \b[0-9]{3}\b ]] && echo "ok" || echo "error";
error
Are you using Fedora?
 
Old 08-30-2011, 01:21 AM   #6
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by 915086731
Code:
[[ "123" =~ \d ]] && echo "ok" || echo "error";
It seems Bash doesn't understand \d as other regex engines do. In Bash, \d is a literal "d", not a decimal number. I checked the man pages and couldn't find any reference to \d being interpreted as a decimal number. I could be wrong on that, though...

Quote:
Originally Posted by 915086731
Code:
[[ "123" =~ [:digit:] ]] && echo "ok" || echo "error";
You have to put [:digit:] inside a character class:

Code:
[[ "123" =~ [[:digit:]] ]] && echo "ok" || echo "error";

Quote:
Originally Posted by 915086731
Code:
[[ "123" =~ \b[0-9]{3}\b ]] && echo "ok" || echo "error";
Here Bash interprets \b as a literal "b". If you try this it will output "ok":

Code:
[[ "b123b" =~ \b[0-9]{3}\b ]] && echo "ok" || echo "error";
I'm guessing that you're trying to use \b as a word boundary assertion. Word boundaries work fine in grep with either \b or \< and \>. In PCRE, \b also works fine as a word boundary. But I can't get it to work in Bash, so I'm beginning to think that either Bash doesn't support \b or I'm doing something wrong. Probably the latter.

Have a look at the manpages for further information: grep(1), regex(7), bash(1) and pcre(3).

Hope that helps.
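
A short sketch pulling the three observations above together. The variable `boundary_re` and the function `has_three_digit_word` are hypothetical names made up for this illustration. The last pattern is one possible stand-in for \b[0-9]{3}\b that relies only on POSIX bracket expressions; unlike a true word boundary, it consumes the neighbouring character instead of being zero-width.

```bash
#!/usr/bin/env bash

# An unquoted \d is just a backslash-quoted "d", so it matches the letter d.
[[ "d" =~ \d ]] && echo "ok" || echo "error"      # prints "ok"
[[ "123" =~ \d ]] && echo "ok" || echo "error"    # prints "error"

# A POSIX class name such as [:digit:] only has its special meaning
# inside a bracket expression, hence the double brackets.
[[ "123" =~ [[:digit:]] ]] && echo "ok" || echo "error"   # prints "ok"

# Stand-in for \b[0-9]{3}\b: three digits that are preceded by the start
# of the string or a non-word character, and followed by the end of the
# string or a non-word character.
boundary_re='(^|[^[:alnum:]_])[0-9]{3}($|[^[:alnum:]_])'

has_three_digit_word() {
    [[ $1 =~ $boundary_re ]]
}

for s in "123" "x 123 y" "a123b" "1234"; do
    if has_three_digit_word "$s"; then
        echo "$s: ok"
    else
        echo "$s: error"
    fi
done
```

With this pattern, "123" and "x 123 y" report ok, while "a123b" and "1234" report error.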
 
1 members found this post helpful.
Old 08-30-2011, 08:11 PM   #7
915086731
Member
 
Registered: Apr 2010
Posts: 144

Original Poster
Blog Entries: 6

Rep: Reputation: 2
Thanks very much! These are the best answers.
 
Old 08-31-2011, 02:54 AM   #8
gnashley
Amigo developer
 
Registered: Dec 2003
Location: Germany
Distribution: Slackware
Posts: 4,928

Rep: Reputation: 612
Single quotes around the expression, fellows:
Code:
[[ "123" =~ '\b[0-9]{3}\b' ]] && echo "ok" || echo "error";
ok
 
Old 08-31-2011, 01:16 PM   #9
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Thanks gnashley, but even with single quotes I still get "error" with Bash 4.1.
 
Old 08-31-2011, 01:27 PM   #10
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
I thought that you shouldn't put quotes around the regexp when using bash's "=~" syntax.
 
Old 08-31-2011, 04:08 PM   #11
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
You can quote any part of the pattern to force a string match, according to the Bash manpage. But I think you're right, MTK358: with single quotes, Bash seems to be matching a literal string.

Code:
$ [[ "123" =~ '\b[0-9]{3}\b' ]] && echo "ok" || echo "error";
error

$ [[ "\b[0-9]{3}\b" =~ '\b[0-9]{3}\b' ]] && echo "ok" || echo "error";
ok

$ [[ "123" =~ '[0-9]{3}' ]] && echo "ok" || echo "error";
error

$ [[ "123" =~ [0-9]{3} ]] && echo "ok" || echo "error";
ok
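
The same effect shows up when the pattern lives in a variable: in Bash 3.2 and later, a quoted expansion on the right-hand side of =~ is matched as a literal string, while an unquoted one is treated as a regex. A minimal sketch, with variable names chosen only for this illustration:

```bash
#!/usr/bin/env bash

re='[0-9]{3}'

# Unquoted expansion: $re is interpreted as a regular expression.
if [[ "123" =~ $re ]]; then unquoted="ok"; else unquoted="error"; fi

# Quoted expansion: "$re" is matched as the literal text [0-9]{3}.
if [[ "123" =~ "$re" ]]; then quoted="ok"; else quoted="error"; fi

# The literal text [0-9]{3} does match itself when quoted.
if [[ "$re" =~ "$re" ]]; then literal_self="ok"; else literal_self="error"; fi

echo "unquoted: $unquoted"           # prints "unquoted: ok"
echo "quoted: $quoted"               # prints "quoted: error"
echo "literal self: $literal_self"   # prints "literal self: ok"
```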
 
Old 08-31-2011, 07:56 PM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
@Diantre - Did you try my solution from post #4?
 
Old 08-31-2011, 10:48 PM   #13
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by grail
@Diantre - Did you try my solution from post #4?
Yes, thanks! Your solution works fine. It's just that I'm baffled why it doesn't work the other way, with the regex inside the test, that's all. Putting the regex in a variable, as you suggest, does the trick.
 
Old 09-01-2011, 01:17 AM   #14
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
If you look at the following page on Greg's Wiki: http://mywiki.wooledge.org/BashGuide/Patterns

You will find the following:
Quote:
Be aware that regex parsing in BASH has changed between releases 3.1 and 3.2. Before 3.2 it was safe to wrap your regex pattern in quotes but this has changed in 3.2. Since then, regex should always be unquoted. You should protect any special characters by escaping it using a backslash. The best way to always be compatible is to put your regex in a variable and expand that variable in [[ without quotes.
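
A small sketch of the two options the quote describes, using a pattern with one special character (a literal dot). The function and variable names are hypothetical. Inline, the dot is protected with a backslash and the pattern stays unquoted; in the variable version, the single quotes belong to the assignment only and the expansion inside [[ ]] is unquoted.

```bash
#!/usr/bin/env bash

# Inline pattern: digits, a literal dot, digits. The backslash protects
# the dot; the pattern itself is not quoted.
is_decimal_inline() {
    [[ $1 =~ ^[0-9]+\.[0-9]+$ ]]
}

# Same pattern stored in a variable and expanded without quotes.
decimal_re='^[0-9]+\.[0-9]+$'
is_decimal_var() {
    [[ $1 =~ $decimal_re ]]
}

for s in "3.14" "3x14"; do
    if is_decimal_inline "$s"; then inline="ok"; else inline="error"; fi
    if is_decimal_var "$s"; then var="ok"; else var="error"; fi
    echo "$s: inline=$inline variable=$var"
done
# prints "3.14: inline=ok variable=ok"
# prints "3x14: inline=error variable=error"
```

The variable form also sidesteps awkward inline escaping for patterns that contain spaces or other characters the shell would otherwise interpret.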
 
1 members found this post helpful.
Old 09-01-2011, 02:24 AM   #15
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by grail
If you look on Greg's Wiki at the following page...
Ahhh! Ok, so that explains it! Thanks for the heads-up grail.
 
  



All times are GMT -5. The time now is 08:55 PM.
