LinuxQuestions.org


crispyleif 02-07-2009 07:57 AM

Learning Bash, potential bug or total misunderstanding regarding globbing.
 
me@marsar-laptop:~/test$ ls [A-Z]* -dl
-rw-r--r-- 1 happyhd happyhd 0 2009-02-07 14:44 file
-rw-r--r-- 1 happyhd happyhd 0 2009-02-07 14:47 File
-rw-r--r-- 1 happyhd happyhd 0 2009-02-07 14:49 file5file
drwxr-xr-x 4 happyhd happyhd 4096 2009-02-07 14:36 newdir
me@marsar-laptop:~/test$

Why am I seeing files starting with lowercase letters here?

JulianTosh 02-07-2009 08:37 AM

I believe globbing is, by default, case insensitive... if you want it sensitive, try "unsetopt CASE_GLOB"

jschiwal 02-07-2009 08:43 AM

I'm not certain, but also look at your locale settings.

ilikejam 02-07-2009 10:49 AM

If your shell glob matches a directory, you'll get a listing of the _contents_ of that directory. Are you sure those listed files are in your current dir?

Edit: never mind - missed the 'd' in '-dl'

Dave

ahc_fan 02-07-2009 11:11 AM

Never noticed this before. Seems like a bug to me because it only happens with ranges, unless, as that person says, it has something to do with locale settings; not sure what he means though. Case sensitivity can be set with the shopt command: 'shopt -u nocaseglob'.
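
For reference, the nocaseglob option mentioned above can be checked in isolation. This is a minimal sketch, assuming bash (shopt is a bash builtin) and a scratch directory made with mktemp; the file names alpha and Bravo are invented for the illustration:

```shell
# Work in a throwaway directory so no real files are involved.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch alpha Bravo

LC_ALL=C                 # keep the example independent of locale collation
shopt -u nocaseglob      # the bash default: globs are case sensitive
sensitive=$(echo b*)     # nothing matches, so the pattern is printed unchanged
shopt -s nocaseglob      # switch to case-insensitive globbing
insensitive=$(echo b*)   # now the pattern matches Bravo
shopt -u nocaseglob      # restore the default

echo "nocaseglob off: $sensitive"
echo "nocaseglob on:  $insensitive"
```

Since nocaseglob is off by default, it is unlikely to be what makes a range such as [A-Z] pick up lowercase names.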

sycamorex 02-07-2009 11:22 AM

This does not answer your question, but just to let you know that:

Quote:

ls [[:upper:]]* -dl
works fine.

colucix 02-07-2009 11:37 AM

This is not a bug and the following solves the issue:
Code:

env LC_COLLATE=C ls -d [A-Z]*
As already mentioned by jschiwal, it is a problem with locale settings. In particular the LC_COLLATE sequence determines the behavior of [A-Z] intervals. Many locale settings collate the upper and lower case letters in this way:
Code:

AaBbCc...XxYyZz
so that the interval [A-Z] matches all the letters except the last lower case "z". Other settings collate the sequence as
Code:

aAbBcC...xXyYzZ
so that the interval [A-Z] matches all letters except the first lower case "a". You can demonstrate this if you have a file beginning with "a" and a file beginning with "z" in the current directory. The command
Code:

ls -d [A-Z]*
will miss one of these two!
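
The effect can be tried in a scratch directory. This is a rough sketch, assuming a shell in which assigning LC_ALL takes effect for later globs (bash behaves this way); the file names are invented for the example:

```shell
# Work in a throwaway directory so no real files are involved.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch alpha Bravo yankee Zulu

# Set the C locale in the shell itself: ranges then follow ASCII order,
# where all upper case letters come before all lower case letters.
LC_ALL=C
range_c=$(echo [A-Z]*)

# The character class names "upper case letters" directly,
# without depending on where they fall in the collation sequence.
class_upper=$(echo [[:upper:]]*)

echo "[A-Z]* in the C locale: $range_c"      # Bravo Zulu
echo "[[:upper:]]*:           $class_upper"  # Bravo Zulu
```

In a locale that interleaves upper and lower case, the first pattern may also pick up alpha or yankee, while the second should not.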

colucix 02-07-2009 11:39 AM

Quote:

Originally Posted by sycamorex (Post 3435282)
This does not answer your question, but just to let you know that:

Code:

ls -ld [[:upper:]]*
works fine.

Given the explanation above, this is without any doubt the most portable way to match upper case or lower case letters (using [[:upper:]] or [[:lower:]] respectively). 1000 points to sycamorex for this great tip! :)

vasmakk 02-07-2009 11:48 AM

Quote:

Originally Posted by crispyleif (Post 3435118)
...
Why am I seeing files starting with lowercase letters here ?

Hi crispyleif!

Case sensitivity works in an ASCII character set. At least that is what I read in Richard Petersen's book "Linux: The Complete Reference, 6th Ed."
I don't think your distro (like mine) uses an ASCII character set, but rather Unicode.

Vas

vasmakk 02-07-2009 11:55 AM

Quote:

Originally Posted by colucix (Post 3435295)
This is not a bug and the following solves the issue:
...

And thanks also, colucix, for the most complete explanation!
Books are never enough
Long live the community!!!

Vas

norobro 02-07-2009 12:13 PM

Interesting thread containing some good info.

While trying the above commands I uncovered some strange behavior:
Code:

norm  ~: /bin/ls -l |grep ^d
drwxr-xr-x  2 root root      4096 2006-12-03 22:08 apache2
drwxr-xr-x  7 norm norm      4096 2009-02-01 12:22 c++
drwxr-xr-x  2 norm norm      4096 2009-02-02 16:02 Desktop
drwxr-xr-x  2 norm norm      4096 2009-01-04 14:59 games
drwxr-x---  4 norm norm      4096 2007-03-25 18:22 GNUstep
drwxr-xr-x 10 norm norm      4096 2008-02-06 14:00 java
drwxr-xr-x  2 norm norm      4096 2008-10-24 19:36 mcthemes
drwxr-xr-x  2 norm norm      4096 2008-01-11 19:53 passwords
drwxr-xr-x  3 norm norm      4096 2008-12-04 19:13 perl
drwxr-xr-x  2 norm root      4096 2007-06-26 16:54 suddenlink
drwxr-xr-x  2 norm norm      4096 2006-11-30 18:53 tcl
drwxr-xr-x  2 norm norm      4096 2009-02-05 16:26 temp
drwxr-xr-x  2 norm norm      4096 2009-01-14 19:00 vids
drwxr-xr-x  4 norm norm      4096 2009-01-29 18:10 wallpapers
drwxr-xr-x  2 norm norm      4096 2009-01-15 20:16 winprograms
drwxr-xr-x  5 norm norm      4096 2007-02-16 17:37 xml

Code:

norm  ~: /bin/ls -ld
drwxr-xr-x 90 norm norm 4096 2009-02-07 07:55 .

I'm using the full path to ls because I have some aliases set.

Isn't the -d switch supposed to provide the same data as the first listing?

colucix 02-07-2009 12:47 PM

Nope, because ls without arguments refers to the current directory. With the -d option, ls lists the directory entry itself (here ".") instead of its contents. To me this is the expected behaviour.

Tip of the day: to run a command while ignoring its alias, you can "escape" it with a backslash:
Code:

\ls
This saves some typing.

norobro 02-07-2009 01:33 PM

Thanks for the "backslash" tip, it does save some typing.
But I'm still confused about the -d switch.
Code:

norm  ~: \ls -ld t*
drwxr-xr-x 2 norm norm  4096 2006-11-30 18:53 tcl
drwxr-xr-x 2 norm norm  4096 2009-02-05 16:26 temp
-rwxr-xr-x 1 norm norm  119 2009-01-29 11:09 test.sh
-rw-r--r-- 1 norm norm 68597 2007-12-18 08:16 tgs_check_image.jpg

Can you give me an example of how it is supposed to be used?

crispyleif 02-07-2009 01:50 PM

Glad to read all this great feedback/info. As someone wrote here:

Long live the community!

My laptop is running Kubuntu and I suspect that Unicode, not ASCII, is used.

Out of curiosity, I ran over to the neighbour and tested her Mac (bash v2.05), and the original command worked fine there.

[[:upper:]] works like a charm! Thx :)
env LC_COLLATE=C ls -d [A-Z]* doesn't
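
One possible reason the env version has no effect: the shell expands [A-Z]* itself, using its own locale, before env and ls ever start, so LC_COLLATE=C only reaches ls after the file names have already been chosen. (If LC_ALL is set, it would also override LC_COLLATE.) Below is a small sketch of that ordering, using a made-up variable GLOB_DEMO in place of LC_COLLATE, followed by a variant that sets the locale in the shell before the glob is expanded. The scratch directory and file names are assumptions for the example:

```shell
unset GLOB_DEMO   # hypothetical variable, used only for this illustration

# The calling shell expands "${GLOB_DEMO:-unset}" before env runs,
# so the value given to env is not visible at expansion time.
seen_at_expansion=$(env GLOB_DEMO=set echo "${GLOB_DEMO:-unset}")

# The command started by env does see the variable.
seen_by_command=$(env GLOB_DEMO=set sh -c 'echo "${GLOB_DEMO:-unset}"')

echo "during expansion: $seen_at_expansion"   # unset
echo "inside command:   $seen_by_command"     # set

# Variant: assign the locale in a subshell first, then let that
# subshell expand the glob under the C locale.
demo_dir=$(mktemp -d)
touch "$demo_dir/alpha" "$demo_dir/Bravo"
matches=$(cd "$demo_dir" && LC_ALL=C && ls -d [A-Z]*)
echo "matches: $matches"                      # Bravo
```

The subshell keeps the locale change from leaking into the rest of the session.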

Btw, '' (two single quotes) can also be used to escape an alias, e.g. ''ls

colucix 02-07-2009 02:06 PM

Quote:

Originally Posted by norobro (Post 3435434)
Can you give me an example of how it is supposed to be used?

Exactly the way you used it in your last example, using a wildcard. Suppose you have a directory containing files and directories:
Code:

$ ls -ld /path/to/dir
drwxr-xr-x 6 alex users 4096 2009-01-25 00:52 /path/to/dir
$ ls -ld /path/to/dir/*
drwxr-xr-x 2 norm norm  4096 2006-11-30 18:53 /path/to/dir/tcl
drwxr-xr-x 2 norm norm  4096 2009-02-05 16:26 /path/to/dir/temp
-rwxr-xr-x 1 norm norm   119 2009-01-29 11:09 /path/to/dir/test.sh
-rw-r--r-- 1 norm norm 68597 2007-12-18 08:16 /path/to/dir/tgs_check_image.jpg

The first command does not list the contents of /path/to/dir because of the -d option. The second one lists the entries one level down, because the shell expands the wildcard /path/to/dir/* before ls runs. It is equivalent to the following command:
Code:

ls -ld /path/to/dir/tcl /path/to/dir/temp /path/to/dir/test.sh /path/to/dir/tgs_check_image.jpg
Either way, the contents of the directories /path/to/dir/tcl and /path/to/dir/temp are not shown (as they would be without the -d option).
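
The same behaviour can be reproduced in a scratch directory. This is a minimal sketch, assuming mktemp is available; the names tcl, inner.txt and test.sh are only stand-ins for the example:

```shell
# Build a small tree: one directory with a file inside, plus one plain file.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/tcl"
touch "$demo_dir/tcl/inner.txt" "$demo_dir/test.sh"
cd "$demo_dir" || exit 1

no_args=$(ls -d)         # no operand: the current directory itself, "."
with_d=$(ls -d tcl)      # the directory entry, not its contents
without_d=$(ls tcl)      # the contents of the directory
glob_with_d=$(ls -d *)   # the shell expands * first, so ls receives: tcl test.sh

echo "ls -d      -> $no_args"
echo "ls -d tcl  -> $with_d"
echo "ls tcl     -> $without_d"
echo "ls -d *    ->" $glob_with_d
```

With the wildcard, ls never sees the * at all; it only receives the names the shell produced, and -d stops it from descending into tcl.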


All times are GMT -5.