LinuxQuestions.org


vincix 08-01-2017 08:19 AM

expansion in bash
 
There's this statement that I've come across in Advanced Bash Scripting (Inserting a blank line between paragraphs in a text file):
Code:

if [[ "$len" -lt "$MINLEN" && "$line" =~ [*{\.}]$ ]]
I don't know exactly how to interpret the bolded part. What is the meaning of the curly brackets in this context? Does it simply mean everything ending in a dot?

GazL 08-01-2017 11:55 AM

It's somewhat confusing because many of those characters have special meaning in an extended regex. However, they're all used within square brackets (a list specification), so they're treated as literals (with the exception of the '.', which still has special meaning in a list and needs to be escaped).

So, unless I'm reading it wrongly, it will match any line that ends with any of '*' '{' '}' or '.' characters.

edit: actually, I got that wrong. The \ and . are both literal too, and the backslash isn't an escape. God I hate reading regexes! ;)

MadeInGermany 08-01-2017 12:22 PM

@GazL, IMHO a dot in a [character set] is a literal dot. No need for escaping it; the preceding backslash does nothing and can be omitted.
BTW a literal backslash in a [character set] DOES need another backslash.
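
A quick way to check the reading that the dot is literal inside a character set, as a minimal sketch: the pattern is kept in a variable with the redundant backslash dropped. The names `re` and `ends_special` and the sample lines are made up for this example and are not part of the ABS script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the line ends in one of * { . }
re='[*{.}]$'          # inside [...] the dot is an ordinary character
ends_special() {
    [[ $1 =~ $re ]]
}

for line in 'The end.' 'glob*' 'open{' 'close}' 'plain text'; do
    if ends_special "$line"; then
        printf 'match:    %s\n' "$line"
    else
        printf 'no match: %s\n' "$line"
    fi
done
```

If the dot were still a wildcard inside the brackets, 'plain text' would match as well, since every non-empty line ends in "any character".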

vincix 08-01-2017 01:14 PM

I don't know why I interpreted it in such a complicated way. Now I understand that those are literal characters. I thought it was some special notation. Thank you :)

GazL 08-01-2017 02:07 PM

Quote:

Originally Posted by MadeInGermany (Post 5742736)
@GazL, IMHO a dot in a [character set] is a literal dot. No need for escaping it; the preceding backslash does nothing and can be omitted.
BTW a literal backslash in a [character set] DOES need another backslash.

I'd already realised my mistake about the dot and corrected my post, but \ doesn't need escaping within a regex list element (other than any escaping it might need to stop the shell interpreting it), as can be seen here:
Code:

test@ws1:~$ [[ 'wibble\' =~ [\.]$ ]] && echo yes
test@ws1:~$ [[ 'wibble\' =~ [\\.]$ ]] && echo yes
yes
test@ws1:~$ echo wibble\\ | egrep '[\.]$'
wibble\

As can be seen from the grep example, it's the shell that requires the additional escape, not the regex pattern itself. Interestingly, however, [[ .... =~ '[\.]$' ]] doesn't work, which I find surprising.
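
The same point can be checked against the regex engine alone, with the shell kept out of the way by single quotes. This is a minimal sketch, assuming a POSIX-style `grep` (it uses `grep -E`, the current spelling of `egrep`); the helper name `ends_in_backslash_or_dot` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does the argument end in a backslash or a dot?
# The single quotes hand [\.]$ to grep untouched, so only the regex
# engine interprets it; inside [...] both \ and . are literal.
ends_in_backslash_or_dot() {
    printf '%s\n' "$1" | grep -Eq '[\.]$'
}

ends_in_backslash_or_dot 'wibble\' && echo 'backslash: yes'
ends_in_backslash_or_dot 'wibble.' && echo 'dot: yes'
ends_in_backslash_or_dot 'wibble'  || echo 'plain: no'
```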

Oh, and don't you just love consistency:

bash:
Code:

test@ws1:~$ [[ 'wibble\' =~ [\.]$ ]]  &&  echo yes
test@ws1:~$

ksh:
Code:

$ [[ 'wibble\' =~ [\.]$ ]]  &&  echo yes
yes
$


vincix 08-01-2017 04:13 PM

Related to this are the following lines in the script:
Quote:

# if [[ "$len" -lt "$MINLEN" && "$line" =~ \[*\.\] ]]
# An update to Bash broke the previous version of this script. Ouch!
# Thank you, Halim Srama, for pointing this out and suggesting a fix.
I'm also surprised that single or double quotes don't work at all. I don't understand why. I've also tried it in ksh, but it seems to behave the same.

GazL 08-01-2017 04:55 PM

It's clearly some weird parsing bug in bash, as regex='[\.]' ; [[ 'wibble\' =~ $regex ]] && echo yes works.
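
The variable form can be spelled out a little further. This is a minimal sketch, assuming a reasonably recent bash; `regex` is the name used in the post above, while the `$` anchor and the helper `check` are additions for this example.

```shell
#!/usr/bin/env bash
# The pattern lives in a variable, so bash passes its characters to the
# regex engine as-is when the expansion is left unquoted.
regex='[\.]$'

check() {
    if [[ $1 =~ $regex ]]; then echo yes; else echo no; fi
}

check 'wibble\'   # ends in a backslash
check 'wibble.'   # ends in a dot
check 'wibble'    # ends in neither
```

Keeping the pattern in a variable also means its backslashes are written once, inside single quotes, rather than being subject to the shell's own escaping rules.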

padeen 08-01-2017 06:50 PM

Quote:

Originally Posted by vincix (Post 5742867)
Related to this are the following lines in the script:

I'm also surprised that simple or double quotes don't work at all.

The reason is that the =~ operator expects a regular expression to follow. If you quote it, the expression is treated as a literal string, which is then compared against the text, and the match fails.
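
A minimal sketch of that difference, assuming bash 3.2 or later with default compatibility settings. The helpers `match_unquoted` and `match_quoted` and the sample strings are made up for illustration: the same three characters act as a regex when bare and as a plain substring when quoted.

```shell
#!/usr/bin/env bash
# Unquoted: a.c is a regex, so the dot matches any character.
match_unquoted() { [[ $1 =~ a.c ]]; }

# Quoted: 'a.c' is compared as a literal string, so the dot must be a dot.
match_quoted() { [[ $1 =~ 'a.c' ]]; }

match_unquoted 'abc' && echo 'unquoted a.c matches abc'
match_quoted   'abc' || echo 'quoted a.c does not match abc'
match_quoted   'a.c' && echo 'quoted a.c matches a.c'
```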

josephj 08-02-2017 04:43 AM

Small perplexment
 
Why isn't this a syntax error for an unmatched single quote?

Code:

test@ws1:~$ [[ 'wibble\' =~ [\.]$ ]] && echo yes

vincix 08-02-2017 04:46 AM

I like how a seemingly simple question produced so much interesting stuff! :) I'm also interested in your question, josephj. What makes it more weird is that if you place the single quote in the regex between the square brackets, you do have to escape it and it only works that way.

GazL 08-02-2017 05:36 AM

Quote:

Originally Posted by josephj (Post 5743092)
Why isn't this a syntax error for an unmatched single quote?

Code:

test@ws1:~$ [[ 'wibble\' =~ [\.]$ ]] && echo yes

Within single quotes all characters are literal: the escape character loses its meaning.
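
A small sketch of that rule, assuming bash (the variable names `s` and `t` are made up for this example): the backslash is just the seventh character of the string, so the quote after it closes the string normally. It also shows the usual workaround for getting a single quote into single-quoted text.

```shell
#!/usr/bin/env bash
s='wibble\'                 # backslash is literal; the final ' closes the string
printf '%s\n' "$s"          # prints: wibble\
printf '%s\n' "${#s}"       # prints: 7

# A single quote cannot appear inside single quotes, so close the quotes,
# add an escaped quote, and reopen them.
t='it'\''s'
printf '%s\n' "$t"          # prints: it's
```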

Things get a little more complicated within double-quotes and different shell implementations can vary with how they interpret the escape character: it will depend on what follows it. Basically it's all a bit of a mess.

josephj 08-03-2017 04:26 AM

Regexes! Can't live with them. Can't live without them.
 
Regexes make me dizzy. I'm gradually getting better at them over the years, but I'm still a beginner.

I even read a whole O'Reilly book on them once and still didn't absorb much.

I can't find the actual strip at the moment, but User Friendly summed it up nicely:

Quote:

Upskilling (strip date: Jun 28, 2007)
Miranda, Sid
-
Sid: Hey, Miranda. Busy?
Miranda: Just trying out some new skills.
Miranda: I've been experimenting a lot to push the boundaries of my knowledge.
Sid: I'd say you've come a long way. Approaching mastery, even.

Miranda: You really think so?
Sid: Well, for one thing I can't tell if that's a regex or line noise.

vincix 08-03-2017 04:34 AM

I don't get the joke, but I'm never any good at those :)

josephj 08-04-2017 03:03 AM

The bad old days ...
 
It doesn't work so well without the cartoon (he's looking over her shoulder at her CRT terminal). In the bad old days, we used 300 baud modems (and 110 baud before that; divide by roughly 10 for characters per second) to dial into time-sharing systems. If the line quality was low, you'd get garbage characters sent or received, generated by noise on the transmission lines (the encoding used analog tones). Those garbage characters don't look that different from a typical regex.

GazL 08-04-2017 04:08 AM

The realisation that there are generations out there that have never experienced serial terminals and line noise and need this joke explaining to them suddenly makes me feel very old.

