bash regexp string compare stopped working
I have a bash script which contains a line like this:
Code:
if [[ ${array[${last}]} =~ "screenpc.PRODUCTION.*" ]]
which WORKED as expected in bash 4.0.33 and now fails in 4.1.2. I instrumented the script to print the value of the left-hand side, and it is exactly what is expected. As noted above, this was working fine until we installed Fedora 13 (kernel 2.6.33), and now it fails. I tried setting the shell option 'extglob' on, with the same results.
Did something change? Are there other shell/bash options that need to be set?
Thanks for any help -- this has the whole installation stopped!
|
also tried compat31
Turned on this shell option -- still getting incorrect results.
|
Not sure why it worked previously, but a pattern inside double quotes is treated literally, asterisk included. For correct pattern matching you can try
Code:
if [[ ${array[${last}]} =~ screenpc.PRODUCTION.* ]]
Note that a trailing .* adds nothing on its own; it only matters when something follows it, as in
Code:
if [[ ${array[${last}]} =~ screenpc.PRODUCTION.*something ]]
Edit: after a little search I found the rule, introduced in bash 3.2, that changed the behavior with respect to previous versions. According to the bash reference manual, any part of the pattern may be quoted to force it to be matched as a string.
|
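For readers who want to see the quoting rule in action, here is a minimal sketch, assuming bash 3.2 or later with the compat31 option left off. The sample value in `name` and the variable `re` are made up for illustration; keeping the regex in a variable and expanding it unquoted is a common idiom, not something the original script did.

```shell
#!/usr/bin/env bash
# Sketch: how quoting the right-hand side of =~ changes the match.
# "name" is a hypothetical sample value, not taken from the real script.
name="screenpc.PRODUCTION.20100115"

# Quoted: the whole right-hand side is a literal string, so bash searches
# for the characters ".*" themselves, which the value does not contain.
if [[ $name =~ "screenpc.PRODUCTION.*" ]]; then quoted="match"; else quoted="no match"; fi

# Unquoted: the right-hand side is a POSIX extended regular expression.
if [[ $name =~ screenpc.PRODUCTION.* ]]; then unquoted="match"; else unquoted="no match"; fi

# Regex stored in a variable and expanded unquoted: also treated as a regex.
re='screenpc\.PRODUCTION\..*'
if [[ $name =~ $re ]]; then from_var="match"; else from_var="no match"; fi

printf 'quoted:   %s\n' "$quoted"
printf 'unquoted: %s\n' "$unquoted"
printf 'variable: %s\n' "$from_var"
```

Only the quoted form should report "no match", which is the failure described in the first post.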
That works -- thanks.
Also works dropping the .* as you said, although this still confuses me. The left-side string does have more characters (e.g. screenpc.PRODUCTION.20100115), so I thought the final .* would be needed so that the regexp actually matched. You are implying that the =~ will be true if the right-hand side matches anything within the left-hand side? i.e. implicitly .*matchthis.* |
Quote:
In this case the expression
Code:
screenpc.PRODUCTION.*
also matches a string like
Code:
somethingherescreenpc.PRODUCTIONsomethingelse
The question is: do you want to match a string like
Code:
screenpc.PRODUCTION.20100115
If so, you can be more specific:
Code:
if [[ ${array[${last}]} =~ screenpc\.PRODUCTION\.[0-9]{8} ]]
Moreover, if you want to match the exact string without any other character before or after the string itself, you can use anchors as in
Code:
if [[ ${array[${last}]} =~ ^screenpc\.PRODUCTION\.[0-9]{8}$ ]]
or, quoting the literal part,
Code:
if [[ ${array[${last}]} =~ ^"screenpc.PRODUCTION."[0-9]{8}$ ]]
|
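The difference between the unanchored and the anchored pattern from the post above can be checked with a small script. This is only a sketch: the helper `check`, the variable names, and the test strings are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: unanchored vs. anchored regex with [[ ... =~ ... ]].
re_loose='screenpc\.PRODUCTION\.[0-9]{8}'      # may match anywhere in the string
re_exact='^screenpc\.PRODUCTION\.[0-9]{8}$'    # must match the whole string

# check REGEX STRING -> prints "match" or "no match" (hypothetical helper)
check() {
    if [[ $2 =~ $1 ]]; then echo "match"; else echo "no match"; fi
}

check "$re_loose" "screenpc.PRODUCTION.20100115"        # exact name
check "$re_loose" "xxscreenpc.PRODUCTION.20100115yy"    # extra text around it
check "$re_exact" "screenpc.PRODUCTION.20100115"        # exact name
check "$re_exact" "xxscreenpc.PRODUCTION.20100115yy"    # rejected by ^ and $
check "$re_exact" "screenpcXPRODUCTION.20100115"        # rejected: \. needs a real dot
```

The first three calls should print "match" and the last two "no match": the anchors reject surrounding text, and the escaped dots reject any other separator character.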
Although I do this with great trepidation, I need to make an amendment to colucix's post:
Quote:
Code:
screenpcPRODUCTION |
Hi grail! :) Actually the question mark was a personal notation (not syntax). BTW, thank you for the notification, I should have chosen another character or maybe another color to avoid confusion.
|
Still somewhat confused, so I'm missing something. Here's how I interpreted the matching logic -- could you tell me where I'm off track?
aabbcc =~ aabbcc matches trivially since the strings are identical
aabbcc =~ aabb.* matches -- the 'aabb' sections match exactly, then the .* matches the 'cc' section
aabbcc =~ aabb does not match -- the 'aabb' sections match but there is nothing in the regexp to match the 'cc' portion on the left
aabbcc =~ bbcc does not match -- the 'bbcc' sections match, but there is nothing in the regexp to match the 'aa' portion
aabbcc =~ .*bbcc matches -- the .* matches the leading 'aa', and then the 'bbcc' sections match
aabbcc =~ .*bb.* matches -- the first .* matches the 'aa', then the 'bb' sections match, then the trailing .* matches the 'cc'
So in the actual case, I would expect these results:
screenpc.PRODUCTION.20100908 =~ screenpc.PRODUCTION.* matches -- the 'screenpc' matches, the first '.' matches any character (which just happens to also be a '.' in the original string), then 'PRODUCTION' matches, and finally the '.*' matches any set of trailing characters -- '.20100908' in this case.
screenpc.PRODUCTION.20100908 =~ screenpc.PRODUCTION does not match -- the 'screenpc.PRODUCTION' section matches as above, but then there is nothing in the regexp to match the '.20100908' portion of the original string.
If the last case does in fact produce a match, then I would think that the definition of the '=~' operator needs to be stated as: "the regexp on the right matches A SUBSTRING in the string on the left", i.e. it's more a 'search for a string' as opposed to 'the regexp matches the string on the left' -- which may in fact be the actual definition, and the book I've looked at is imprecise.
Sorry to belabor the point, but since the system does in fact appear to match the last example as you said, it's clear that I'm missing something fundamental and would like to get it straight.
Thanks
|
Actually you are missing the main point: a regular expression is a kind of search pattern. You have a string of any length and a regular expression which describes a sequence of characters to be searched for inside the string. In other words, a string matches a regular expression when it contains a sequence of characters described by the regular expression itself.
Hence it is not mandatory to write a regular expression that matches the entire string. Nevertheless, you can refine the regular expression to match only the string (or set of possible strings) you want. An example of regular expression refinement: the following
Code:
.
matches any string containing at least one character, whereas
Code:
...
matches any string containing at least three characters, and
Code:
^...$
matches only strings of exactly three characters.
|
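The three patterns from the post above ('.', '...' and '^...$') can be compared side by side. The following is a rough sketch in which the helper `matches` and the sample strings are invented for the demonstration: '.' needs at least one character somewhere in the string, '...' needs at least three, and '^...$' needs exactly three.

```shell
#!/usr/bin/env bash
# Sketch: refining a regex from "contains" to "is exactly".

# matches REGEX STRING -> prints "yes" or "no" (hypothetical helper)
matches() {
    if [[ $2 =~ $1 ]]; then echo "yes"; else echo "no"; fi
}

# Try each pattern against strings of length 1 to 4.
for s in a ab abc abcd; do
    printf '%-4s  one-or-more=%s  three-or-more=%s  exactly-three=%s\n' \
        "$s" "$(matches '.' "$s")" "$(matches '...' "$s")" "$(matches '^...$' "$s")"
done
```

Only "abc" should report yes in all three columns; "abcd" still contains three characters in a row but fails the anchored pattern.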
Yes, that is exactly the clarification I needed.
Thanks. |
Glad you got there :) Please mark the thread as SOLVED now that you have a solution.
|