[SOLVED] Number of backslash characters to use in grep
I have not been able to find any good documentation about the number of backslash characters to use in a grep command.
If I try this, it fails:
Code:
echo "my\name" | grep "my\name"
I thought maybe grep needed a \\, since the documentation says so, but that fails too.
Adding another \, i.e. using \\\, succeeds.
It then keeps succeeding up to 6 backslashes, \\\\\\, and from the 7th onwards it starts failing.
Things get even worse if \ appears at the end:
Code:
echo "my\\" | grep "my\\"
This starts working only with 4 backslashes.
This has confused me totally. With single quotes it works differently, which is expected, but it is still not clear how many to use.
Is it because of piping echo into grep that the behaviour changes?
Is there any proper documentation which explains how this works?
In your example, remember that the backslash is an escape character for grep.
So in your example of echo 'my\\' | grep 'my\\', the pattern does not account for both backslashes.
Why?
Because you have not escaped the second backslash of the input string in the grep pattern.
In other words, in echo 'my\\' | grep 'my (escape, then backslash)', the pattern only handles the first backslash.
A second (escape, then backslash) pair is needed in grep to handle the second backslash in your input string.
An escape is needed in grep for EACH reserved character of the input string that should be matched literally.
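To see this concretely, here is a minimal sketch, assuming bash and GNU grep (the variable names are only illustrative). It uses printf instead of echo so the input is not altered on its way into the pipe. Keep in mind that grep reports a match when the pattern occurs anywhere in the line, so a pattern with one escaped backslash also matches input that contains two.

```shell
# Input that really contains two backslashes (single quotes keep them literal).
input='my\\'

# Pattern my\\ : "my" followed by ONE literal backslash.
# It still matches, because grep searches for the pattern anywhere in the line.
one=$(printf '%s\n' "$input" | grep -c 'my\\' || true)

# Pattern my\\\\ : "my" followed by TWO literal backslashes (one escape each).
two=$(printf '%s\n' "$input" | grep -c 'my\\\\' || true)

# The two-backslash pattern does not match a line with only one backslash.
short=$(printf '%s\n' 'my\' | grep -c 'my\\\\' || true)

printf 'one=%s two=%s short=%s\n' "$one" "$two" "$short"   # one=1 two=1 short=0
```

The `|| true` is only there because grep exits with status 1 when nothing matches.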
The link itself says that the behaviour of backslashes is inconsistent. Do I have to live with this?
Can I work with single quotes and still expand variables?
Single-quote the pattern except where variable expansion is needed. For that, take advantage of adjacent strings being concatenated, e.g. 'part1'"$var"'part2'.
Kevin Barry
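A minimal sketch of that concatenation, assuming bash and GNU grep. The variable name `name` and the sample text dir\file are hypothetical, chosen only for illustration.

```shell
name='file'                 # hypothetical variable to expand inside the pattern
pattern='dir\\'"$name"      # single-quoted part stays literal, "$name" is expanded

printf '%s\n' "$pattern"    # dir\\file

# grep reads \\ as one literal backslash, so the pattern matches dir\file.
count=$(printf '%s\n' 'dir\file' | grep -c "$pattern" || true)
printf 'count=%s\n' "$count"   # count=1
```

The expanded value is still read by grep as part of the regular expression, so any reserved characters inside the variable would need their own escaping.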
Clear as mud?
In the example I quoted, I used double quotes, echo "my\\", which in effect gives a single backslash. Going by your logic, grep 'my\\' should have worked, since there is only one backslash in the input string.
I am actually shooting in the dark here, since I am trying to guess whether you are searching for "dir/file" in a file containing file name paths. Your input string is the same. The only difference here is the addition of an escaping backslash and single quotes to the grep search.
Why don't we take a look at the documentation, and learn just how the shell processes backslashes?
man bash:
Code:
QUOTING
.....
Enclosing characters in double quotes preserves the literal
value of all characters within the quotes, with the exception
of $, `, \, and, when history expansion is enabled, !. The
characters $ and ` retain their special meaning within double
quotes. The backslash retains its special meaning only when
followed by one of the following characters: $, `, ", \, or
<newline>. A double quote may be quoted within double quotes
by preceding it with a backslash. If enabled, history expansion
will be performed unless an ! appearing in double quotes is
escaped using a backslash. The backslash preceding the ! is not
removed.
So if a string is double-quoted, then \$, \`, \", and \\ are all converted to their literal characters, and \<newline> is removed as a line continuation. All other backslash combinations are passed on literally (as opposed to an unquoted string, where every backslash is processed).
Note especially that a single backslash at the end of the string will escape the following closing quote-mark, and break shell syntax.
The final processed string is then passed on to grep, which also does its own backslash parsing.
Hopefully now you can figure out how many backslashes are needed to protect grep's reserved characters from interpretation. Using echo "string\with\backslashes" will print the string as grep (or any other command) would see it, at least with bash's builtin echo, which leaves backslashes alone by default; printf '%s\n' "string\with\backslashes" does the same in any shell.
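The two stages can be checked directly. The sketch below, assuming bash and GNU grep, first prints what the shell hands over for several double-quoted strings, then feeds one of them to grep as a pattern. The variable names are illustrative only.

```shell
# Stage 1: what the shell passes on after double-quote processing.
# printf '%s\n' prints its argument without interpreting backslashes.
printf '%s\n' "my\\name"       # my\name   (\\ collapses to \)
printf '%s\n' "my\\\name"      # my\\name  (\\ collapses, \n is left alone)
printf '%s\n' "my\\\\name"     # my\\name  (two pairs collapse to two backslashes)

# Stage 2: grep reads \\ in the pattern as one literal backslash.
seen=$(printf '%s' "my\\\\name")
if printf '%s\n' 'my\name' | grep -q "my\\\\name"; then
    found=yes
else
    found=no
fi
printf 'pattern=%s found=%s\n' "$seen" "$found"   # pattern=my\\name found=yes
```

Three and four backslashes produce the same pattern, which is why both succeed in the original question.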
Finally, don't forget that backslash processing is also done when you declare the value of a variable, so you have to take that step into consideration as well. Using single quotes when setting the variable, to preserve the whole string literally, is probably what you'll usually want to do.
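As a small illustration of the variable case, assuming bash and GNU grep (the variable names are made up for this sketch), compare what is stored with each quoting style:

```shell
single='my\\name'    # single quotes: stored exactly as typed
double="my\\name"    # double quotes: \\ collapses to \ at assignment time

printf '%s\n' "$single"    # my\\name
printf '%s\n' "$double"    # my\name

# Only the single-quoted value still escapes the backslash for grep.
hits=$(printf '%s\n' 'my\name' | grep -c "$single" || true)
printf 'hits=%s\n' "$hits"   # hits=1
```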
Your shell simply parses the line according to its syntax, then executes the resulting command and arguments. What happens next is up to the command in question.
If it's a command that parses complex expressions of some kind, then it will certainly use some kind of escaping to handle any characters that it considers reserved syntax. Regular expression syntax, for example, generally relies on the backslash for its escaping, so any command that parses regex will use it.
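If the search text is a fixed string rather than a regular expression, one option not raised so far in the thread is grep's -F option, which treats the pattern as literal text, so grep itself does no backslash parsing. A minimal sketch, with a hypothetical variable name `needle`:

```shell
needle='my\name'    # the text to find, written exactly as it appears

# -F: fixed-string match; -- guards against patterns that start with a dash.
matches=$(printf '%s\n' 'my\name' 'myname' | grep -F -- "$needle")
printf '%s\n' "$matches"    # my\name
```

Shell quoting still applies, which is why the value is assigned in single quotes.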
By the way, note that the regular expressions used in grep and sed have two different modes. In basic regex mode, certain characters are "off" by default, and backslashing them actually turns them on, whereas in extended regex mode the inverse holds. These two expressions are equivalent, for example:
Code:
$ echo -e 'foo\nbar\nbaz' | grep '\(foo\|bar\)'
foo
bar
$ echo -e 'foo\nbar\nbaz' | grep -E '(foo|bar)'
foo
bar
See the "basic vs extended regex" section in the grep man page for more.
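The same switch applies in the other direction when those characters should be matched literally. A minimal sketch, assuming GNU grep, with a made-up sample line:

```shell
line='f(x)|g'

# Basic mode: ( ) and | are ordinary characters, so no backslashes are needed.
basic=$(printf '%s\n' "$line" | grep -c 'f(x)|g' || true)

# Extended mode: the same characters are operators and must be escaped.
extended=$(printf '%s\n' "$line" | grep -E -c 'f\(x\)\|g' || true)

printf 'basic=%s extended=%s\n' "$basic" "$extended"   # basic=1 extended=1
```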