LinuxQuestions.org


luvshines 09-08-2011 12:07 AM

Number of backslash character to use in grep
 
An old problem, but it still confuses me.

I have not been able to find any good documentation about the number of backslash characters to use in a grep command.

If I try this, it fails
Code:

echo "my\name" | grep "my\name"
Thought maybe grep needed a \\ since the documentation says so, but that fails too.
Adding another \, i.e. using \\\, succeeds.
It then keeps succeeding up to 6 backslashes, \\\\\\, and from the 7th onwards starts failing.

Things start getting even worse if \ appears at the end
Code:

echo "my\\" | grep "my\\"
This starts working with 4 backslashes only

This has confused me totally. With single quotes, it works differently, which is expected but still not clear how many to use.

Is it because of the piping of echo and grep that the behaviour changes?
Is there any proper documentation that explains how this works?

grail 09-08-2011 12:19 AM

Why not use single quotes and forgo the headache??

luvshines 09-08-2011 02:28 AM

Quote:

Originally Posted by grail (Post 4464585)
Why not use single quotes and forgo the headache??

The words/lines to be grepped in my scripts are generally contained in variables, which don't expand in single quotes ('$var').

I also noticed that grep behaves differently depending on whether the \ is written on the command line or contained in a variable.

grail 09-08-2011 04:33 AM

Have a look about 2/3 down this page :- http://tldp.org/LDP/abs/html/escapingsection.html#ESCP

I think the explanation is quite clear.

luvshines 09-11-2011 02:35 PM

Quote:

Originally Posted by grail (Post 4465453)
Have a look about 2/3 down this page :- http://tldp.org/LDP/abs/html/escapingsection.html#ESCP

I think the explanation is quite clear.

:), the link itself says that the behaviour of backslashes is inconsistent. Do I have to live with this ?

Can I work with single quotes and still expand variables ?

8-bit 09-11-2011 03:19 PM

In your example, remember that the back-slash is a delimiter for grep.
So your example of echo 'my\\' | grep 'my\\' will not work.
Why?
Because you have not used a delimiter for the second back-slash in grep for the input string.
In other words, echo 'my\\' | grep 'my (delimiter then back-slash) only handles the first back-slash.
A second (delimiter then back-slash) is needed in grep to handle the second back-slash in your input string.
A delimiter is needed in grep to handle EACH reserved character in the input string.

Clear as mud?

ta0kira 09-11-2011 03:56 PM

Quote:

Originally Posted by luvshines (Post 4469062)
:), the link itself says that the behaviour of backslashes is inconsistent. Do I have to live with this ?

Can I work with single quotes and still expand variables ?

Single-quote the pattern except where variable expansion is needed. For that, take advantage of adjacent strings being concatenated, e.g. 'part1'"$var"'part2'.
Kevin Barry
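
A minimal sketch of this concatenation, assuming a hypothetical variable `name` and made-up input lines. printf is used instead of echo so that no echo-specific backslash handling gets in the way.

```shell
# Hypothetical variable holding plain text (no regex metacharacters).
name='alice'

# The single-quoted parts reach grep untouched; only "$name" is expanded.
# The pattern grep receives is: ^user\\alice$
printf '%s\n' 'user\alice' 'user\bob' | grep '^user\\'"$name"'$'
# prints: user\alice
```

Inside the single-quoted part, \\ is passed through as two characters, which grep then reads as one literal backslash.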

luvshines 09-24-2011 10:01 AM

Quote:

Originally Posted by ta0kira (Post 4469119)
Single-quote the pattern except where variable expansion is needed. For that, take advantage of adjacent strings being concatenated, e.g. 'part1'"$var"'part2'.
Kevin Barry

That won't help either
Code:

search='\'
echo "\\" | grep $search
grep: Trailing backslash

echo "\\" | grep "$search"
grep: Trailing backslash

echo "\\" | grep ''$search''
grep: Trailing backslash
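
In all three attempts above grep receives a single backslash as its pattern, which is an incomplete escape. A sketch of two possible ways around that, assuming the goal is to match the variable's content literally: the -F option of grep, or doubling the backslashes first (the sed call and the variable name `escaped` are assumptions of this sketch).

```shell
search='\'    # one literal backslash, as in the attempts above

# -F makes grep treat the pattern as a fixed string, so the lone
# backslash is matched literally instead of starting an escape.
printf '%s\n' 'my\name' 'myname' | grep -F -- "$search"
# prints: my\name

# Without -F, the backslash has to be escaped for the regex.
# This sed call doubles every backslash in the variable.
escaped=$(printf '%s\n' "$search" | sed 's/\\/\\\\/g')
printf '%s\n' 'my\name' 'myname' | grep -- "$escaped"
# prints: my\name
```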


luvshines 09-24-2011 10:04 AM

Quote:

Originally Posted by 8-bit (Post 4469098)
In your example, remember that the back-slash is a delimiter for grep.
So your example of echo 'my\\' | grep 'my\\' will not work.
Why?
Because you have not used a delimiter for the second back-slash in grep for the input string.
In other words, echo 'my\\' | grep 'my (delimiter then back-slash) only handles the first back-slash.
A second (delimiter then back-slash) is needed in grep to handle the second back-slash in your input string.
A delimiter is needed in grep to handle EACH reserved character in the input string.

Clear as mud?

In the example I quoted, I used double quotes, echo "my\\", which in effect gives a single backslash. Going by your logic, grep 'my\\' should have worked, since there is only one backslash in the input string.

8-bit 09-25-2011 12:16 PM

Try this code.
Code:

echo "my\name" | grep 'my\\name'
I am actually shooting in the dark here since I am trying to guess if you are searching for "dir/file" in a file containing file name paths. Your input string is the same. The only difference here is the addition of a delimiter and single quotes to the grep search.

David the H. 09-27-2011 04:19 PM

Why don't we take a look at the documentation, and learn just how the shell processes backslashes?

man bash:
Code:

QUOTING
        .....

        Enclosing characters in double quotes preserves the literal
        value of all characters within the quotes, with the exception
        of $, `, \, and, when history expansion is enabled, !.  The
        characters $ and ` retain their special meaning within double
        quotes.  The backslash retains its special meaning only when
        followed by one of the following characters: $, `, ", \, or
        <newline>.  A double quote may be quoted within double quotes
        by preceding it with a backslash.  If enabled, history expansion
        will be performed unless an ! appearing in double quotes is
        escaped using a backslash. The backslash preceding the ! is not
        removed.

So if a string is double quoted, then \$, \`, \", \\, and \<newline> will all be converted to their literal equivalents. All other backslash combinations will be passed on literally (as opposed to not quoting the string, where all backslashes are processed).
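
A small sketch of these quoting rules. The sample strings are made up, and printf '%s\n' is used because it prints its argument exactly as the shell passed it.

```shell
printf '%s\n' "a\\b"    # \\ collapses to one backslash:        a\b
printf '%s\n' "a\b"     # \b is not special, passed on as is:   a\b
printf '%s\n' "a\$b"    # \$ becomes a literal dollar sign:     a$b
printf '%s\n' 'a\\b'    # single quotes change nothing:         a\\b
printf '%s\n' a\b       # unquoted: the backslash is consumed:  ab
```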

Note especially that a single backslash at the end of the string will escape the following closing quote-mark, and break shell syntax.

The final processed string is then passed on to grep, which also does its own backslash parsing.

Hopefully now you can figure out how many backslashes are needed to protect grep's reserved characters from interpretation. Using echo "string\with\backslashes" will print the string as grep (or any other command) would see it.
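
As a sketch, here is that check applied to the double-quoted patterns from the first post. printf '%s\n' stands in for echo, since some shells' echo interprets backslashes on its own.

```shell
# What the shell hands to grep for each double-quoted pattern.
printf '%s\n' "my\name"          # 1 typed -> my\name    (1 backslash)
printf '%s\n' "my\\name"         # 2 typed -> my\name    (1 backslash)
printf '%s\n' "my\\\name"        # 3 typed -> my\\name   (2 backslashes)
printf '%s\n' "my\\\\name"       # 4 typed -> my\\name   (2 backslashes)
printf '%s\n' "my\\\\\\\name"    # 7 typed -> my\\\\name (4 backslashes)

# grep reads \\ as one literal backslash, so the 2-backslash pattern
# matches an input line that contains a single backslash.
printf '%s\n' "my\name" | grep "my\\\\name"
# prints: my\name
```

With one backslash grep sees \n, which is not a literal backslash, and with four it looks for two literal backslashes in the input. That is consistent with the failures reported for 1, 2 and 7 typed backslashes.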

Finally, don't forget that backslash processing is also done when you declare the value of a variable, so you have to take that step into consideration as well. Using single quotes when setting the variable, to preserve the whole string literally, is probably what you'll usually want to do.
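
A short sketch of the difference at assignment time. The variable names `dq` and `sq` are made up for illustration.

```shell
dq="my\\name"   # double quotes: the shell stores my\name  (1 backslash)
sq='my\\name'   # single quotes: the shell stores my\\name (2 backslashes)

printf '%s\n' "$dq" "$sq"

# Expanding "$sq" does not process backslashes a second time, so grep
# receives my\\name and matches a line containing my\name.
printf '%s\n' 'my\name' | grep -- "$sq"
# prints: my\name
```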

luvshines 10-02-2011 01:03 PM

Quote:

Originally Posted by David the H. (Post 4483806)

The final processed string is then passed on to grep, which also does its own backslash parsing.

Ahh!! I think I was missing this. String processing done twice, grep doing its own too. I think echo was not doing it, unless I use echo -e

One last question, maybe a silly one, do all the commands do their own processing too for backslashes ?

David the H. 10-03-2011 11:57 PM

Your shell simply parses the line according to its syntax, then executes the resulting command and arguments. What happens next is up to the command in question.

If it's a command that parses complex expressions of some kind, then it will certainly use some kind of escaping to handle any characters that it considers reserved syntax. Regular expression syntax, for example, generally relies on the backslash for its escaping, so any command that parses regex will use it.
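
To illustrate, a sketch in which the same string, stored once in a made-up variable `s`, is treated differently depending on which command (or which printf conversion) receives it.

```shell
s='a\tb'    # four characters: a, backslash, t, b

printf '%s\n' "$s"    # %s prints the argument as is:           a\tb
printf '%b\n' "$s"    # %b makes printf expand \t into a tab
printf '%s\n' "$s" | grep -c 'a\\tb'    # grep reads \\ as one backslash
# the last command prints: 1
```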


By the way, make a note that the regular expressions used in grep and sed have two different modes. In basic regex mode, certain characters are "off" by default, and backslashing them actually turns them on. Whereas if you enable the extended regex mode, the inverse occurs. These two expressions are equivalent, for example:

Code:

$ echo -e 'foo\nbar\nbaz' | grep '\(foo\|bar\)'
foo
bar

$ echo -e 'foo\nbar\nbaz' | grep -E '(foo|bar)'
foo
bar

See the "basic vs extended regex" section in the grep man page for more.


All times are GMT -5.