LinuxQuestions.org
Old 08-21-2011, 09:20 AM   #1
915086731
Member
 
Registered: Apr 2010
Posts: 121
Blog Entries: 5

Rep: Reputation: 2
Some questions about using regular expressions, asking for help!


Please see the following code,
Code:
[saturn@saturn-pc new]$ [[ "aab" =~ ab ]] && echo "ok" || echo "error";
ok
[saturn@saturn-pc new]$ [[ "aab" =~ "ab" ]] && echo "ok" || echo "error";
ok
[saturn@saturn-pc new]$
"aab" should not match ab. Can you tell me why?

Code:
[saturn@saturn-pc new]$ [[ "aab" =~ a*b ]] && echo "ok" || echo "error"
ok
[saturn@saturn-pc new]$ [[ "aab" =~ *ab ]] && echo "ok" || echo "error"
error
[saturn@saturn-pc new]$
Why does *ab not match "aab"?
Thanks!
 
Old 08-21-2011, 09:33 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,504

Rep: Reputation: 1079
Quote:
Originally Posted by 915086731
"aab" should not match ab. Can you tell me why?
Can you tell us why you think it should not? Perhaps explain what you think "=~" means.
Quote:
Why does *ab not match "aab"? Thanks!
Again, tell us why you think it should - regex is not (shell) globbing.
 
Old 08-21-2011, 09:34 AM   #3
jlinkels
Senior Member
 
Registered: Oct 2003
Location: Bonaire
Distribution: Debian Wheezy/Jessie/Sid, Linux Mint DE
Posts: 4,246

Rep: Reputation: 557
In a regular expression, '*' does not mean "match any character, zero or more occurrences". It means "repeat the preceding item zero or more times".

This is confusing, because in filename matching it does mean that. If you tried the ls command, ls *ab would match a file named aab.

What matches any character in a regular expression is '.' (period). So matching any character zero or more times would be '.*'. To find it at the start of a line, use '^.*'.

jlinkels
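
A minimal sketch of the difference described above, assuming bash 3.0 or newer. The variable names (s, re, glob_result, regex_result) are only for illustration. The regex is kept in a variable so that quoting behaves the same way across bash versions.

```shell
#!/usr/bin/env bash
s="aab"

# Glob (filename-style) matching with ==: '*' stands for any run of characters.
if [[ $s == *ab ]]; then glob_result="ok"; else glob_result="error"; fi

# Regex matching with =~: '.' is any character, '*' repeats the item before it.
re='.*ab'
if [[ $s =~ $re ]]; then regex_result="ok"; else regex_result="error"; fi

echo "glob  *ab  : $glob_result"
echo "regex .*ab : $regex_result"
```

Both tests should report ok: the glob *ab and the regex .*ab describe the same set of strings, while the regex *ab does not.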
 
1 members found this post helpful.
Old 08-21-2011, 09:42 AM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Quote:
Originally Posted by 915086731
Please see the following code,
Code:
[saturn@saturn-pc new]$ [[ "aab" =~ ab ]] && echo "ok" || echo "error";
ok
[saturn@saturn-pc new]$ [[ "aab" =~ "ab" ]] && echo "ok" || echo "error";
ok
[saturn@saturn-pc new]$
"aab" should not match ab. Can you tell me why?
On the contrary, aab matches the regular expression ab, since it contains the substring ab. If you want ab to match only as a separate word, you have to insert word boundaries in the regular expression:
Code:
$ [[ "aab" =~ \\bab\\b ]] && echo "ok" || echo "error"
error
$ [[ "ab" =~ \\bab\\b ]] && echo "ok" || echo "error"
ok
where \b is the word-boundary specification and the extra preceding backslash escapes it, so that bash passes it to the regex engine correctly.

Note that this works only up to bash version 3.1. To get the same behaviour in bash 3.2 and newer, you have to set the compat31 option:
Code:
shopt -s compat31
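
If the goal is for the whole string to be exactly ab, rather than ab as a separate word, anchors may be a simpler alternative that does not depend on compat31. This is only a sketch: whole_match is a hypothetical helper name, and the pattern is stored in a variable so that the shell's quoting rules do not change its meaning. As far as I know, \b is a GNU extension, whereas ^ and $ are standard regex anchors.

```shell
#!/usr/bin/env bash
# whole_match is a hypothetical helper: it succeeds only when the entire
# string is "ab". '^' pins the match to the start of the string, '$' to the end.
whole_match() {
    local re='^ab$'
    [[ $1 =~ $re ]]
}

for s in aab ab; do
    if whole_match "$s"; then echo "$s: ok"; else echo "$s: error"; fi
done
```

With the anchors in place, aab should report error and ab should report ok.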

Last edited by colucix; 08-21-2011 at 09:44 AM.
 
1 members found this post helpful.
Old 08-21-2011, 09:50 AM   #5
915086731
Member
 
Registered: Apr 2010
Posts: 121
Blog Entries: 5

Original Poster
Rep: Reputation: 2
Thanks syg00
I thought ab was only a fixed string: it does not contain any metacharacter such as . or * or ?, so I am very puzzled.

Code:
[saturn@saturn-pc new]$ [[ "aab" =~ ab ]] && echo "ok" || echo "error"
ok
[saturn@saturn-pc new]$ [[ "aaaaab" =~ ab ]] && echo "ok" || echo "error"
ok             !! ab seems to have the same effect as a*b !!
[saturn@saturn-pc new]$ [[ "aaaaab" =~ a*b ]] && echo "ok" || echo "error"
ok
[saturn@saturn-pc new]$ [[ "aabaab" =~ a*b ]] && echo "ok" || echo "error"
ok             !! a*b means repeat "a" zero or more times, but it seems to match any character one or more times !!
 
Old 08-21-2011, 10:48 AM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 241
A regular expression is not the same as shell globbing. If you want to match exactly "ab", use a fixed string. In the shell, you can just use a simple case/esac without any regular expression:
Code:
case "aab" in
"ab" ) echo "ok";;
*) echo "not ok";;
esac
or use an if/else statement with the = sign.
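
A minimal sketch of the if/else variant, with s and result as illustrative variable names. The = operator inside [ ] compares the two strings as a whole, so no pattern matching is involved:

```shell
#!/usr/bin/env bash
s="aab"

# Plain string comparison: true only if both sides are identical.
if [ "$s" = "ab" ]; then result="ok"; else result="not ok"; fi
echo "$result"
```

Here the comparison fails because "aab" and "ab" are different strings, which is the answer the original question expected.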
 
1 members found this post helpful.
Old 08-21-2011, 11:11 AM   #7
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,396
Blog Entries: 2

Rep: Reputation: 903
Quote:
Originally Posted by ghostdog74
Regular expression is not the same as shell globbing..
Exactly correct, and it is not the same as string comparison either. A regular expression tries to match any part of the string against which it is tested. The regex 'ab' matches if any substring of 'aab' is 'ab', and since there is such a substring, the test evaluates to 'True'.
--- rod.
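
One way to see which part of the string was matched is bash's BASH_REMATCH array, whose element 0 holds the text matched by the last successful =~ test. The sketch below assumes bash 3.0 or newer; show_match is a hypothetical helper name used only for illustration.

```shell
#!/usr/bin/env bash
# show_match is a hypothetical helper: it prints the part of the string
# that the regex actually matched, or "no match" if the test fails.
show_match() {
    local s=$1 re=$2
    if [[ $s =~ $re ]]; then
        echo "${BASH_REMATCH[0]}"
    else
        echo "no match"
    fi
}

show_match "aab"    'ab'     # only the trailing "ab" is matched
show_match "aaaaab" 'ab'     # again only "ab", not the whole string
show_match "aaaaab" 'a*b'    # here the match covers the whole string
show_match "aabaab" 'a*b'    # the match is the first "aab", not the whole string
```

This suggests why ab and a*b both report ok for the strings in post #5: each regex finds some matching substring, but the substrings they match are different.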
 
1 members found this post helpful.
Old 08-21-2011, 11:37 AM   #8
915086731
Member
 
Registered: Apr 2010
Posts: 121
Blog Entries: 5

Original Poster
Rep: Reputation: 2
Thanks all
 
  

