Originally Posted by pixellany
Suppose I do this:
(list all files beginning with the letter "l")
Am I using regexes or just bash syntax?
The first argument to grep is a regex.
I think some explanation of how and when the shell expands patterns is in order.
One of the big misconceptions people have is that when you run ls *.txt, ls is started and gets passed the parameter *.txt. This is incorrect. When you hit return, the shell interprets the command, first expanding the pattern *.txt to the list of files which match it, and then passing that list to ls as arguments. So if you have two files which match the pattern in your current working directory, say one.txt and two.txt, ls will get two arguments: one.txt and two.txt.
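If you want to see the expansion for yourself, here is a rough sketch. show_args is a throwaway helper I made up for the demonstration (it is not a standard command), and the file names are only examples. It just reports what it was handed:

```shell
# Hypothetical helper: print how many arguments arrived, and each one in brackets.
show_args() {
    printf '%d argument(s):' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

# Work in a scratch directory so we know exactly which files exist.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch one.txt two.txt

# The shell replaces *.txt with the matching names before show_args runs.
expanded=$(show_args *.txt)
echo "$expanded"
```

The helper never sees the * at all, only the two file names.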
You can prevent the shell from pre-expanding the pattern by quoting it, or by escaping the * with a backslash character, \. For example, neither ls "*.txt" nor ls \*.txt expands the pattern; both pass the literal string *.txt to the ls command.
In this case, unless you actually have a file called *.txt (unlikely since * is a special character, but not impossible), ls will complain:
ls: *.txt: No such file or directory
Important note: there is one more case in which the program will see the literal string *.txt: if no files match the glob pattern, the shell passes the original pattern through unchanged.
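A small sketch of the three ways a program can end up with the literal pattern: quoting, escaping, and the no-match case. Again show_args is a made-up helper and the file names are just examples. The no-match behaviour shown is the default in bash and POSIX sh; I am assuming you have not turned on bash options such as nullglob or failglob, which change it.

```shell
# Hypothetical helper: print how many arguments arrived, and each one in brackets.
show_args() {
    printf '%d argument(s):' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch notes.txt

# Quoted or escaped: the shell leaves the pattern alone.
quoted=$(show_args '*.txt')
escaped=$(show_args \*.txt)

# Unquoted, but nothing here ends in .log, so the pattern is passed as-is.
unmatched=$(show_args *.log)

echo "$quoted"
echo "$escaped"
echo "$unmatched"
```

In all three cases the helper receives a single argument which still contains the *.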
Note that the shell has no way to know what a program expects as arguments. For programs which want the pattern, and not the pre-expanded list, you must quote the glob characters to prevent the shell from pre-expanding them.
The command grep takes a regular expression pattern as its first non-option argument. Regular expressions use some characters which are also used in shell glob patterns, so if you use them, you need to quote or escape those characters to prevent the shell from trying to interpret them.
For example, if you are in a directory containing the files antelopes.txt apples.txt ardvaarks.txt, and you issue the command ls | grep a*, the behaviour will be as follows:
- The shell will expand the glob pattern a* to the list of files which match this pattern: antelopes.txt apples.txt ardvaarks.txt
- The shell will start a grep process with this expanded pattern list as arguments: grep antelopes.txt apples.txt ardvaarks.txt
- The shell will run ls, and pass the output to the input of the grep process.
If you don't know that the shell is pre-expanding the pattern before passing the arguments to grep, you might think it should show a list of the three files. Not so. grep's behaviour is to process standard input only if no files are specified on the command line, but it gets three arguments: antelopes.txt apples.txt ardvaarks.txt. It will treat the first one as a regular expression pattern, and the subsequent ones as files in which to search for that pattern.
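Here is a sketch of that situation which you can run safely in a scratch directory. Putting echo in front of a command is a handy way to see the command line after the shell has finished with it. The files are created empty, which is my assumption for the demonstration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch antelopes.txt apples.txt ardvaarks.txt

# echo shows the command line grep would really receive after glob expansion.
seen_by_grep=$(echo grep a*)
echo "$seen_by_grep"

# antelopes.txt is used as the regex and the other two (empty) files are
# searched; the output of ls on standard input is ignored, so nothing prints.
# grep exits non-zero when nothing matches; || true keeps a strict shell going.
unquoted=$(ls | grep a* || true)
echo "unquoted output: [$unquoted]"
```

The second echo prints empty brackets, which is exactly the "where did my files go?" surprise described above.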
What's particularly confusing about this is that the behaviour is different depending on which directory you are in. If you were in a directory containing the files bananas batter balloon, you would see all three files as the output of the same command.
This is because there are no filenames which match the glob pattern a*, so the literal string a* is passed as the only argument to grep, which treats it as a regular expression which means "match any lines containing 0 or more a characters", i.e. any line at all.
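The other directory, sketched the same way. The second command is my guess at what was actually wanted (names starting with a), written as a quoted regex; ^a is standard grep syntax for "line begins with a". This assumes the default shell behaviour for unmatched globs.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bananas batter balloon

# No file name matches a*, so grep receives the literal regex a* and reads
# the output of ls. "Zero or more a" matches every line.
unquoted=$(ls | grep a*)
echo "$unquoted"

# Quoted and anchored: behaves the same in any directory. Nothing here starts
# with a, so the result is empty (|| true absorbs grep's non-zero exit).
anchored=$(ls | grep '^a' || true)
echo "anchored output: [$anchored]"
```

The unquoted command lists all three files here, while the quoted one gives the same answer no matter what happens to be lying around in the directory.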
If all this has left you feeling uncertain, don't worry too much about it for now. You will need to understand this to use the shell properly - especially if you're going to write shell scripts which might be used for something important - but there is a practice you can adopt to avoid these problems: quote patterns for programs which expect patterns. This means grep, find (with the -name option), sed, awk etc.
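The same habit applied to find, as a sketch with made-up file names. Unquoted, the shell expands the pattern against the current directory first; quoted, find gets the pattern itself and applies it in every directory it visits.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir sub
touch top.txt sub/deep.txt

# Unquoted: the shell turns *.txt into top.txt, so find only looks for that name.
unquoted=$(find . -name *.txt)
echo "$unquoted"

# Quoted: find receives *.txt and matches it at every level.
quoted=$(find . -name '*.txt' | sort)
echo "$quoted"
```

The unquoted version silently misses sub/deep.txt, and if more than one .txt file sat in the current directory find would complain about its arguments instead.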
I hope that helps and wasn't too boring. By the way, if your books don't mention shell patterns, throw them away. The best way to learn is to try, fail, read the man page, fail again, read the man page, fail again, read the man page again, succeed and drink a celebratory cup of tea.
Get used to the format of the manual pages. The bash manual page is full of useful information, although it suffers from being dauntingly large, and it's probably wise to learn from a bunch of tutorials and use the man page as a reference. For smaller utilities, though, the man page is the best source of information. The structure of man pages is such that the most important and concise information is at the top, and it gets more detailed as you move down the page. Find out what "man -k" does, and apropos too.