So, reading the last few posts, there appear to be some misconceptions about the '-r' option and extended RegEx in GNU sed.
The most important thing to notice is that GNU sed understands the extended RegEx operators even without '-r'; in the default (basic) syntax they just have to be backslash-escaped. Supplying the '-r' option does
not add any additional functionality. It simply removes the need to escape those operators. E.g., "+" is an extended RegEx operator. To have sed interpret it as such without '-r' you have to prepend a backslash, like "\+". Using the -r option simply makes the backslash unnecessary in most cases:
Code:
echo 'word' | sed 's/w\+/C/'
echo 'word' | sed -r 's/w+/C/' # same as above
The same goes for parentheses:
Code:
echo 'word word' | sed 's/\(word\) \1/\1 CHANGE/'
echo 'word word' | sed -r 's/(word) \1/\1 CHANGE/' # same as above
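Put differently, the backslash does not disappear with '-r', it just swaps roles. A small sketch of that for "+", assuming GNU sed (the input string 'w+ www' and the variable names are made up for illustration):

```shell
# Without -r: a bare "+" is a literal plus sign, "\+" means "one or more".
plain_bre=$(echo 'w+ www' | sed 's/w+/C/')        # matches the literal text "w+"
escaped_bre=$(echo 'w+ www' | sed 's/w\+/C/')     # matches one or more "w"

# With -r the two roles are swapped.
plain_ere=$(echo 'w+ www' | sed -r 's/w+/C/')     # matches one or more "w"
escaped_ere=$(echo 'w+ www' | sed -r 's/w\+/C/')  # matches the literal text "w+"

printf '%s\n' "$plain_bre" "$escaped_bre" "$plain_ere" "$escaped_ere"
# prints:
# C www
# C+ www
# C+ www
# C www
```

So for "+" the backslash only switches between the literal and the special meaning, and '-r' decides which of the two is the default.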
However, this is not true for "\<" and "\>":
Code:
echo 'word' | sed 's/\<w.*\>/CHANGE/'
echo 'word' | sed -r 's/\<w.*\>/CHANGE/' # same as above
echo 'word' | sed -r 's/<w.*>/CHANGE/' # not same as above; expects literal '<' and '>' in input string.
So the word boundary symbols "\<" and "\>" keep their backslash with or without '-r'. This behavior is a bit inconsistent.
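To make the difference visible, here is a quick sketch, again assuming GNU sed (the input strings and variable names are only for illustration). The unescaped pattern matches only when the input really contains angle brackets, while the escaped form anchors on word boundaries:

```shell
# Unescaped "<" and ">" are ordinary characters, so they must appear in the input.
no_brackets=$(echo 'word' | sed -r 's/<w.*>/CHANGE/')      # no match, line is printed unchanged
with_brackets=$(echo '<word>' | sed -r 's/<w.*>/CHANGE/')  # matches the whole "<word>"

# The escaped form matches at the start and end of a word instead.
boundary=$(echo 'a word' | sed -r 's/\<w.*\>/CHANGE/')

printf '%s\n' "$no_brackets" "$with_brackets" "$boundary"
# prints:
# word
# CHANGE
# a CHANGE
```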
@OP: You said that you are reading the tutorial by Bruce Barnett. I suppose you mean the tutorials on this site:
http://www.grymoire.com/Unix
You might get a bit confused when you read the tutorial about Regex in general on that site, especially this chapter:
http://www.grymoire.com/Unix/Regular.html#uh-12
It says that "\{" and "\}" are basic RegEx and that they cannot be used as extended RegEx. However, in the table further down they are marked as extended RegEx. This is contradictory.
Anyway, RegExes are a great source of confusion since every language/program seems to add its own small modifications to them.
BTW, this is how sed handles "{}":
Code:
echo 'hello' | sed 's/l\{2\}/CC/'
echo 'hello' | sed -r 's/l{2}/CC/' # same as above
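The same escaping rule carries over to the other interval forms. A short sketch, assuming GNU sed (the input 'helllo' with three l's and the variable names are made up for illustration):

```shell
exact=$(echo 'helllo' | sed 's/l\{2\}/CC/')       # exactly two "l"
at_least=$(echo 'helllo' | sed 's/l\{2,\}/CC/')   # two or more "l"
same_ere=$(echo 'helllo' | sed -r 's/l{2,}/CC/')  # same as the previous line
literal=$(echo 'l{2}' | sed 's/l{2}/CC/')         # without -r, bare braces are literal text

printf '%s\n' "$exact" "$at_least" "$same_ere" "$literal"
# prints:
# heCClo
# heCCo
# heCCo
# CC
```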