LinuxQuestions.org


romagnolo 02-20-2012 07:17 PM

Bash: extended regex pattern 'NOT' disabled inside parameter expansion?
 
I didn't want to prolong my bash saga here, but the web seems empty on this.
I just found that some of the extended pathname-expansion features (those activated by the shell built-in shopt -s extglob) don't work when the pattern is embedded in a parameter expansion, although they do work on their own on paths, which is what they are mainly meant for.
Namely, I tested the 'NOT' operator !(). On path names, I get the correct behaviour:
Code:

$ shopt -s extglob
$ ls
full.jpg  full.jpg.meta  half.jpg  half.jpg.meta
$ ls !(*meta)
full.jpg  half.jpg

but the equivalent job, with the pattern wrapped in a parameter expansion (pattern substitution), does not work:
Code:

$ shopt -s extglob
$ ls
full.jpg  full.jpg.meta  half.jpg  half.jpg.meta
$ x=$(ls)
$ echo _${x//!(*meta)/}_
__

I would expect the pattern to expand as it does in the first box: match everything not matching *meta, replace the matches with an empty string, and feed 'echo' the 2 remaining filenames 'full.jpg' and 'half.jpg'.
Also, I think this time The Manual is on my side :)
Quote:

${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced. [...] If string is null, matches of pattern are deleted and the / following pattern may be omitted.

catkin 02-20-2012 10:37 PM

The difference is caused by the pattern matching in ls !(*meta) operating on each file individually whereas ${x//!(*meta)/} operates on the whole string.

The first match for !(*meta) is "full.jpg full.jpg.meta half.jpg half.jpg.met". This is removed leaving "a" which also matches !(*meta) so is removed as the second match:
Code:

c@CW8:/tmp/try$ echo _${x/!(*meta)/}_
_a_
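
The two steps can be reproduced without any files by assigning the string by hand. This is only a minimal sketch, assuming the file names are separated by single spaces; the variable names first and all are illustrative and not from the thread:

```shell
#!/usr/bin/env bash
shopt -s extglob   # needed for !(...) inside parameter expansion patterns

# Hand-written stand-in for the ls output (assumption: single spaces).
x='full.jpg full.jpg.meta half.jpg half.jpg.meta'

first=${x/!(*meta)/}    # remove only the first (longest) match
all=${x//!(*meta)/}     # remove every match

echo "_${first}_"
echo "_${all}_"
```

With a single slash only the trailing "a" should survive; with a double slash nothing should be left, which is the empty result from the first post.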

Incidentally, x=$(ls) does not put the same value in x as is shown by ls run at the command prompt. ls adjusts its output formatting according to what it is writing to. The -x option can be used to work around this behaviour:
Code:

c@CW8:/tmp/try$ ls
full.jpg  full.jpg.meta  half.jpg  half.jpg.meta
c@CW8:/tmp/try$ x=$(ls)
c@CW8:/tmp/try$ echo "$x"
full.jpg
full.jpg.meta
half.jpg
half.jpg.meta
c@CW8:/tmp/try$ x=$(ls -x)
c@CW8:/tmp/try$ echo "$x"
full.jpg  full.jpg.meta  half.jpg  half.jpg.meta

A solution to your requirement is to pattern match space-separated words in the ls output:
Code:

c@CW8:/tmp/try$ x=$(ls -x)
c@CW8:/tmp/try$ echo _${x//*( )*([^ ])meta/}_
_full.jpg half.jpg_

That will fail when file names include spaces, so it is more robust to parse ls output that has one file name per line:
Code:

c@CW8:/tmp/try$ x=$(/bin/ls -1)
c@CW8:/tmp/try$ echo _${x//$'\n'*([^$'\n'])meta/}_
_full.jpg half.jpg_

But parsing ls output can never be fully robust for reasons explained here.
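
One way to avoid parsing ls altogether is to let the shell expand the extended glob straight into an array, so each file name stays intact even if it contains spaces. A rough sketch, assuming bash with extglob available; the temporary directory, the extra file "my pic.jpg" and the names tmpdir and files are made up for the illustration:

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob   # nullglob: an unmatched glob expands to nothing

# Throwaway directory with the thread's file names plus one containing a space.
tmpdir=$(mktemp -d)
touch "$tmpdir"/{full.jpg,full.jpg.meta,half.jpg,half.jpg.meta} "$tmpdir/my pic.jpg"

cd "$tmpdir" || exit 1
files=( !(*meta) )          # the glob is tested against each file name separately
printf '_%s_\n' "${files[@]}"
cd - >/dev/null || exit 1
rm -rf "$tmpdir"
```

Because the pattern is applied as pathname expansion here, it behaves like ls !(*meta) in the first post rather than like the substitution on one long string.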

romagnolo 02-21-2012 09:23 AM

Quote:

Originally Posted by catkin (Post 4607865)
The difference is caused by the pattern matching in ls !(*meta) operating on each file individually whereas ${x//!(*meta)/} operates on the whole string.

This is valuable information that the manual does not spell out. I guess you learnt this by experience, or is there some special secret webpage somewhere?

Quote:

The first match for !(*meta) is "full.jpg full.jpg.meta half.jpg half.jpg.met". This is removed leaving "a" which also matches !(*meta) so is removed as the second match:
Code:

c@CW8:/tmp/try$ echo _${x/!(*meta)/}_
_a_

Incidentally, x=$(ls) does not put the same value in x as is shown by ls run at the command prompt. ls adjusts its output formatting according to what it is writing to. The -x option can be used to work around this behaviour: [...]
Thank you. Actually I had noticed that spare "a" when using ${x/ instead of ${x//, but I still can't figure out how the pattern is applied to the string. Using the single-line output provided by the better ls -x, in my theory the matching should go like this:
  • *meta : match the sub-string made of any characters and terminated by "meta" on the right.
    So, although this diverges from the original goal, it should match the whole string: "full.jpg full.jpg.meta half.jpg half.jpg.meta"
  • !(*meta) : it should invert the previous match of *meta, giving an empty string.
But in practice the pattern generates two matches (why?): "full.jpg full.jpg.meta half.jpg half.jpg.met" and "a".

catkin 02-21-2012 10:00 PM

No secret web page -- just curiosity and experimentation, seasoned with a dash of experience.

Rather than your concept of "match *meta and discard it from the match", my concept is "match the longest string that does not match *meta".

Both are reasonable concepts of how things might work, but experimentation with bash shows it behaving according to the latter.
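
An experiment of this kind could look like the sketch below: test ever-shorter prefixes of the string against !(*meta) with [[ ]] and stop at the first one that matches. It only illustrates the "longest string that does not match *meta" idea and is not a description of how bash implements the matching; the string is typed by hand with single spaces, and the names prefix, len and longest are illustrative.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Hand-written stand-in for the ls -x output (assumption: single spaces).
x='full.jpg full.jpg.meta half.jpg half.jpg.meta'

longest=
for (( len = ${#x}; len > 0; len-- )); do
    prefix=${x:0:len}
    # == inside [[ ]] does glob matching against the unquoted pattern.
    if [[ $prefix == !(*meta) ]]; then
        longest=$prefix
        break
    fi
done

echo "longest match: '$longest'"
echo "left over:     '${x:${#longest}}'"
```

The whole string ends in "meta", so it is rejected; dropping one character gives a prefix ending in "met", which no longer matches *meta and is therefore accepted, leaving the single "a".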

romagnolo 02-22-2012 04:39 PM

Thank you catkin.


All times are GMT -5.