confused about usage of quotes and wildcards in linux
Hello everybody
I am struggling to learn the command line in Linux and have noticed a confusing (at least for me) behaviour of wildcards and quoting.
E.g.:
Here you get a different behaviour (th is not treated as '*th*'), and presumably in *th* the * is considered as "all".
So, is the behaviour of quoting and/or wildcards different depending on the command, or am I missing something else?
Thanks in advance
The command you are using may have its own wildcard expansion, so it may itself expand the '*te*' you are giving it on the command line.
Try the following program:
Code:
#!/usr/bin/perl
use strict;
use warnings;
warn "command line arguments - one per line:\n";
my $arg_number = 0;
foreach my $arg(@ARGV)
{
warn "[$arg_number] $arg\n";
$arg_number++;
}
to see how your shell expands wildcards. That is, save the code as, say, 'args.pl', give the file executable permission, and try it with, say,
Code:
./args.pl *te*
If you want the test program to print to stdout rather than stderr as it does now, replace 'warn' with 'print'.
Last edited by Sergei Steshenko; 03-05-2011 at 10:33 AM.
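For anyone without Perl at hand, roughly the same experiment can be sketched in plain shell. The function name show_args and the scratch file names are made up for illustration; the point is only to show what arrives after the shell has done its expansion.

```shell
# Hypothetical helper: print each argument on its own line, numbered from 0,
# so you can see exactly what the shell handed over.
show_args() {
    i=0
    for arg in "$@"; do
        printf '[%d] %s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

# Scratch directory with a few made-up file names.
tmpdir=$(mktemp -d)
touch "$tmpdir/after" "$tmpdir/arcane" "$tmpdir/notes"

# Unquoted: the shell replaces the pattern with the matching names first.
unquoted=$(cd "$tmpdir" && show_args *te*)
# Quoted: the pattern is passed along untouched.
quoted=$(cd "$tmpdir" && show_args '*te*')

printf 'unquoted:\n%s\nquoted:\n%s\n' "$unquoted" "$quoted"
rm -rf "$tmpdir"
```

In this scratch directory the unquoted call receives two arguments (after and notes), while the quoted call receives the single string *te*.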
As you are using Ubuntu, you are more than likely running dash as a shell. This is quite similar to bash but with some different tweaks (none of which I think are affecting you here).
Sergei is quite right in saying that different applications / commands will have their own idiosyncrasies, but in this case I would say that in all the situations where you have not quoted the string, the shell is expanding it to what it finds in the directory you are running in, prior to executing the command.
Your first 2 examples are a good demonstration of this:
When you quote or remove the wildcards, locate only has the string itself to work with. When you use the unquoted wildcard version, however, the shell first expands the string to whatever it finds locally.
Hence if in the directory you have files called:
Code:
after
arcane
Now locate will only look for things that have these words / strings appear inside a file or directory name.
Once quoted, though, this effectively passes the wildcards to locate, and it performs its own version of expansion as part of its lookup.
There are two levels of processing going on. First the shell processes the command line, then it passes it to the program, which does whatever it's designed to do.
In the locate example without quoting, the shell first tries to match files in the current directory and builds anything it matches into a list of filenames. This list is then passed to locate. Only if there are no matches does it pass the string on literally.
Try running echo locate *te* and see what you get.
Now let's look at what the locate man page says about how it processes the strings it gets:
Code:
If a pattern is a plain string — it contains no metacharacters — locate displays
all file names in the database that contain that string anywhere. If a pattern
does contain metacharacters, locate only displays file names that match the
pattern exactly.
So in example one, you appear to have approximately 29 files in your directory that contain the text "te", which are having their names passed to locate. locate then searches for those file names as plain text strings, and not coincidentally comes up with a list of the files in that directory (and perhaps a few identically named files in other locations).
The same thing is happening with find. The unprotected shell globbing is finding one match in the current directory for *th*, which when passed onto find is producing a syntax error due to the spaces in it.
So the lesson learned is that you should nearly always quote strings when they contain anything other than plain text, and sometimes even then.
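A small sketch of the find case, under made-up assumptions (a scratch directory holding the files three and plain, plus sub/other). It is not the original poster's directory, just an illustration of how one local match changes what find is asked to do.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
touch "$tmpdir/three" "$tmpdir/plain" "$tmpdir/sub/other"

# Quoted: find receives the pattern itself and applies it at every level.
quoted=$(cd "$tmpdir" && find . -name '*th*' | sort)

# Unquoted: the shell substitutes the one local match ("three") first,
# so find is really asked for "-name three" and never reports sub/other.
unquoted=$(cd "$tmpdir" && find . -name *th* | sort)

printf 'quoted:\n%s\nunquoted:\n%s\n' "$quoted" "$unquoted"
rm -rf "$tmpdir"
```

With more than one local match the unquoted form would hand find several extra words after -name, which is where error messages tend to come from.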
Your shell (Bash, Dash, Sh, whatever) is expanding the unquoted wildcard character (*) before locate ever sees it.
A classic demonstration may be helpful. Try this on your own system and see what happens.
Code:
$ mkdir junk
$ cd junk
$ touch one two three four five
$ echo *e*
five one three
Here the shell expands the pattern before the echo command even begins to run. echo never sees the pattern, just arguments "five one three".
locate is capable of expanding patterns, but if you don't quote your pattern then the shell will eat it first. The same happens with find -name
Edit:
A bit of related advice ... if you ever use a regular expression from the command line, enclose the entire regex inside single quotes. Leaving a regex exposed to the shell is asking for all kinds of confusion.
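To illustrate the regex advice, here is a hedged sketch using grep on a made-up data file. The file names ab and aab exist only so that the bare regex has something to collide with.

```shell
tmpdir=$(mktemp -d)
printf 'ab\nabc\nxyz\n' > "$tmpdir/data.txt"
touch "$tmpdir/ab" "$tmpdir/aab"    # names that the glob a*b happens to match

# Single quotes: grep receives the regex a*b exactly as typed
# and counts the lines of data.txt that match it.
quoted=$(cd "$tmpdir" && grep -c 'a*b' data.txt)

# Preview (via echo) of the words grep would receive if the regex were bare:
# the shell treats a*b as a filename pattern and replaces it.
bare=$(cd "$tmpdir" && echo grep -c a*b data.txt)

printf '%s\n%s\n' "$quoted" "$bare"
rm -rf "$tmpdir"
```

In the bare case grep would end up with aab as its regex and ab as an extra file to search, which is not what was typed.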
@grail
"say in all the situations where you have not quoted the string it is expanding to what it finds in the directory you are running in prior to executing the command."
I tested locate *te* and it gives me several results, some of them in my home dir (where I run the command) and others outside it. A small sample is:
So it shouldn't be the case that unquoted *te* is searched only in the dir I am running it in.
It was very important for me to realize that different applications / commands will have their own idiosyncrasies and that I should memorize some things instead of trying to solve all problems with logic, as well as to understand that there are two levels of processing, as you describe.
Well, thanks, I should definitely quote, and the message is taken.
But ...
Code:
ioannis@ioannis-laptop:~$ locate *te* | more -2
/etc/GNUstep
/etc/GNUstep/GNUstep.conf
And none of the other 27 results has any *te* in it.
I have very little experience, so please correct me if I am wrong, but I am starting to understand that it doesn't make sense to try to understand how the command line responds when you don't follow the rules.
Thanks for your reply. I think the last part of your post is the most useful.
And, I got "five one three" too.
I wonder, though, what the * in *te* could be expanded to by the shell to give only 29 results (all of them containing te, though), but in the end I understand the answer is not important.
It really is important, if you want to learn how to get the best possible use of your shell. Try these examples.
Code:
$ mkdir foo
$ cd foo
$ echo *mandb*
*mandb*
Bash expands *mandb* into the set of all file names which match the pattern. (I think this is what is meant by globbing.)
Because no matches are found, *mandb* is passed unchanged to echo.
echo receives the argument *mandb* and prints it to stdout.
Quote:
Originally Posted by Bash Reference Manual, 3.5.8 Filename Expansion
Bash scans each word for the characters ‘*’, ‘?’, and ‘[’. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged.
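The nullglob clause in that passage can be checked with a short sketch. It assumes bash is installed and uses an empty scratch directory, so the pattern has nothing to match; each test runs in a fresh bash so the option does not leak into your own session.

```shell
tmpdir=$(mktemp -d)    # empty, so *mandb* cannot match anything

# Default (nullglob off): the unmatched pattern stays as one literal word.
words_default=$(bash -c 'cd "$1" && set -- *mandb* && echo "$#"' _ "$tmpdir")

# With nullglob on: the unmatched pattern is removed, leaving zero words.
words_nullglob=$(bash -c 'shopt -s nullglob; cd "$1" && set -- *mandb* && echo "$#"' _ "$tmpdir")

printf 'default: %s word(s), nullglob: %s word(s)\n' "$words_default" "$words_nullglob"
rm -rf "$tmpdir"
```

With nullglob set, a command like locate *mandb* in such a directory would be run with no pattern argument at all.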
Hope you're still reading, because here comes the most important part.
Code:
$ locate *mandb*
/home/me/foo/fake-mandb-file
In the previous example we used pattern mandb, and locate found several files from the /usr/ hierarchy. Why didn't it find them this time?
Bash expands *mandb* into the set of all file names which match the pattern.
The file fake-mandb-file is the one and only match.
The argument fake-mandb-file is passed to locate.
locate searches its database for file names matching fake-mandb-file.
/home/me/foo/fake-mandb-file is the one and only match among all of the directories in your filesystem indexed by the locate database.
I hope that now you can see why locate *te* might not return the same results as locate '*te*'. It is crucial to understand when Bash is eating your unquoted wildcard patterns and expanding them into unexpected strings of gibberish.
Thank you for your comprehensive reply
I think it makes sense, but just to make sure I understood this:
When given the argument te, bash searches for a file named exactly te, doesn't find any, and passes te unchanged to the locate command, which in turn treats it as *te* because of the way locate works. It then searches its database (created by updatedb) for *te* and finds a large number of matches, since it looks for files containing te in their names.
When given *te*, bash performs the pathname expansion, finds only a few matches, and passes these expanded matches to the locate command, which of course returns the same number of matches.
So it is clear why *te* gives different results than te.
My only question is: where does bash search to find files?
And also:
I understood that the quoted argument is not useful with locate when looking in a specific directory.
When given the argument te, bash searches for a file named exactly te, doesn't find any, and passes te unchanged to the locate command...
This part is incorrect. Since the command line does not contain any special wildcard characters, NO globbing/file matching is attempted at all, and the string te is simply passed straight to locate as-is. Not that it really matters though, as the same simple text string is passed whether there's a match or not.
The rest of your paragraph is correct.
Quote:
My only question is where does the bash search to find files
Globbing is attempted in the current directory only, unless the pattern includes a relative or absolute path. locate ../*te* would attempt to glob in the directory above the one you're in and locate /home/user/*te* would likewise attempt to glob the user's home directory. Globbing also ignores hidden dotfiles by default, but this can be changed with the dotglob shell option.
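The dotfile behaviour can be seen in a quick sketch, assuming bash is available. The two file names are invented for the purpose.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/.hidden-te" "$tmpdir/visible-te"

# Default: * does not match a leading dot, so the hidden file is skipped.
default=$(bash -c 'cd "$1" && echo *te*' _ "$tmpdir")

# With dotglob on, hidden names take part in the expansion as well.
with_dotglob=$(bash -c 'shopt -s dotglob; cd "$1" && echo *te*' _ "$tmpdir")

printf '%s\n%s\n' "$default" "$with_dotglob"
rm -rf "$tmpdir"
```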
Thank you for your correction about the fact that bash doesn't search for te. So it is now clear that bash intervenes only to process wildcards; otherwise it just passes the argument to locate.
But something peculiar happens here:
Now, when giving
As you see, I now get 31812 results instead of 29 when giving the same command while being in the same dir (home) and not having changed anything! (To be absolutely sure, I copy-pasted the command from my own post.)
In my comment nr 7 here, while I was getting only 29 results, you can see 2 of them (there were more) that do not belong to my home dir (in which I was while giving the command), so it appears that bash did not search only locally.
Any idea?
When given the argument te, bash searches for a file named exactly te, doesn't find any, and passes te unchanged to the command
No. Bash only performs expansions on unquoted pattern matching characters.
Quote:
Originally Posted by Bash Reference Manual, 3.5.8 Filename Expansion
Bash scans each word for the characters ‘*’, ‘?’, and ‘[’. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged.
Code:
$ locate te
/bin/date
/bin/mktemp
/bin/tempfile
/boot/System.map-2.6.24-23-generic
/boot/System.map-2.6.24-24-generic
/boot/System.map-2.6.24-25-generic
/boot/System.map-2.6.24-26-generic
/boot/System.map-2.6.24-27-generic
/boot/System.map-2.6.24-28-generic
/boot/memtest86+.bin
... (goes on for too many lines to include here) ...
Bash passes the string te unchanged to the locate command because there are no unquoted pattern matching characters in te.
locate receives the argument te and internally transforms it to *te*.
locate returns the set of all file names in its database which match the pattern *te*.
Quote:
Originally Posted by man locate
If any PATTERN contains no globbing characters, locate behaves as if the pattern were *PATTERN*.
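The rule quoted from the man page can be imitated with a few lines of shell. This is only a stand-in under stated assumptions: locate_like is a hypothetical function, not the real locate, and the "database" is a hard-coded list of paths.

```shell
# Hypothetical stand-in for locate's matching rule: a pattern without
# *, ? or [ is treated as *PATTERN*; anything else must match the whole path.
locate_like() {
    pattern=$1
    case $pattern in
        *[\*\?\[]*) ;;                       # has metacharacters: use as given
        *)          pattern="*$pattern*" ;;  # plain string: match anywhere
    esac
    while IFS= read -r path; do
        case $path in
            $pattern) printf '%s\n' "$path" ;;
        esac
    done
}

# Made-up database contents, one path per line.
db='/bin/date
/bin/mktemp
/home/me/foo/fake-te-file
/usr/bin/sort'

plain=$(printf '%s\n' "$db" | locate_like te)            # acts like *te*
single=$(printf '%s\n' "$db" | locate_like fake-te-file) # acts like *fake-te-file*
anchored=$(printf '%s\n' "$db" | locate_like '*te')      # whole path must end in te

printf 'plain:\n%s\nsingle:\n%s\nanchored:\n%s\n' "$plain" "$single" "$anchored"
```

The second call is what locate effectively receives after the shell has already turned an unquoted *te* into the one local file name.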
When given as *te* , bash performs the pathname expansion, finds a few matches only, passes these expanded matches to the command locate which of course returns the same number of matches.
Try and prove it for yourself.
Code:
~$ mkdir foo
~$ cd foo
~/foo$ touch fake-te-file one two three
~/foo$ ls
fake-te-file one three two
~/foo$ sudo updatedb
~/foo$ locate *te*
/home/me/foo/fake-te-file
~/foo$ echo *te*
fake-te-file
Bash expands *te* into the set of all file names which match the pattern *te*. Only fake-te-file matches.
Bash passes fake-te-file to locate.
locate accepts the argument fake-te-file and internally transforms it to the pattern *fake-te-file*.
locate searches its database for all file names which match the pattern *fake-te-file*.
locate returns the one and only result, /home/me/foo/fake-te-file
Quote:
where does the bash search to find files
Bash compares strings with unquoted pattern matching characters against the names of files in the current working directory. As David the H. correctly pointed out, Bash also obeys any relative paths you specify like ./subdir/ or ../otherdir/ when matching the pattern.
Code:
~/foo$ mkdir bar
~/foo$ touch bar/one bar/two bar/three
~/foo$ ls bar
one three two
~/foo$ echo bar/*e*
bar/one bar/three
As you see, I now get 31812 results instead of 29 when giving the same command while being in the same dir (home) and not having changed anything!
I can't really say because I don't know your system. My best guess is that no files in your current working directory match the pattern, so locate receives the argument *te*.
Quote:
Originally Posted by Bash Reference Manual, 3.5.8 Filename Expansion
If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged.
I suppose it is also conceivable that your system is configured somewhat differently with regards to locate and/or shell options. I can't even guess what those differences might be. In such a case you will have to review the appropriate documentation and configuration files with great intensity to discover the cause. I very much doubt this is the case though.
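One way to test the "no files match" guess is to count how many names the shell turns *te* into in a given directory. The helper below is a rough sketch; count_te_matches is a made-up name and the pattern is hard-coded.

```shell
# Hypothetical helper: report how many existing names *te* expands to
# inside the directory given as $1 (runs in a subshell, so no lasting cd).
count_te_matches() (
    cd "$1" || exit 1
    set -- *te*
    # If nothing matched, the pattern is left as the literal word *te*,
    # which does not exist as a file.
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
)

tmpdir=$(mktemp -d)
empty_count=$(count_te_matches "$tmpdir")

touch "$tmpdir/after" "$tmpdir/notes"
filled_count=$(count_te_matches "$tmpdir")

printf 'before: %s, after: %s\n' "$empty_count" "$filled_count"
rm -rf "$tmpdir"
```

A count of 0 in the directory where the command was typed would mean locate received the pattern itself, which would be consistent with a very long result list.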
Thanks again for your detailed reply
I have made it clear in my mind that bash intervenes only to do the globbing, and that if no globbing is required, it passes the arguments to locate unchanged.
I presume this is what should happen and indeed, when giving e.g
I can't really say because I don't know your system. My best guess is that no files in your current working directory match the pattern, so locate receives the argument *te*.
No, it can't be the case, because I didn't change anything. Something really strange is going on:
I suppose it is also conceivable that your system is configured somewhat differently with regards to locate and/or shell options. I can't even guess what those differences might be. In such a case you will have to review the appropriate documentation and configuration files with great intensity to discover the cause. I very much doubt this is the case though.
I have a clean installation of Ubuntu 10.10 64 bit and have touched nothing related to configurations of bash .