What character is stronger than the minus or the hyphen character?
I use a naming convention to help sort files in a directory.
Under the "ls" command on Linux, I have all the project directories in the following order: Quote:
Now, I would really like "ls" to list a project called 04.07+01-SProject, created later than 04.07-SProject, below 04.07-SProject, like
04.07-SProject
04.07+01-SProject
but under ls I get
04.07+01-SProject
04.07-SProject
It is a bit complicated, but it is crucial for me to keep track of different projects in "chronological" order. ls ranks the strength of characters in this order:
_ : + . 012 abcz ABC Z = -
I think the solution is to use the weaker characters such as _ : + rather than the strongest character - from the very beginning (I have outlawed the weakest one, _, in file names for some reason). Or to override the alphabetical order set by the Linux convention. I would have liked the precedence to go like
- = + . 0 12 abc AB C Z
Can I change the precedence at will? I hope my question did not confuse you. This is an academic exercise. |
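One way to get a precedence of your own choosing without touching the locale is to sort on a translated copy of each name. What follows is only a sketch under assumptions: custom_sort is a made-up helper name, the mapping covers just the wished-for order from the post above (- = + . then digits, lowercase, uppercase), and names are assumed to contain no tabs or newlines. Names that already contain ! " # or $ would share keys with the mapped characters.

```shell
#!/bin/sh
# custom_sort: hypothetical helper. Reads names on stdin (one per line)
# and prints them in the order  - = + .  digits  a-z  A-Z.
# Assumes names contain no tab or newline characters.
custom_sort() {
    while IFS= read -r name; do
        # Build a sort key: map = + . - to " # $ ! (all of which precede
        # the digits in byte order, with ! first) and swap letter case so
        # that lowercase sorts before uppercase. The hyphen is placed last
        # in the tr set so that it is taken literally.
        key=$(printf '%s' "$name" | LC_ALL=C tr '=+.a-zA-Z-' '"#$A-Za-z!')
        printf '%s\t%s\n' "$key" "$name"
    done |
    LC_ALL=C sort |   # plain byte-order comparison, key comes first on the line
    cut -f2-          # drop the key, keep the original name
}

# Demo with the two project names from the question.
printf '%s\n' '04.07+01-SProject' '04.07-SProject' | custom_sort
```

Used as "ls -1 | custom_sort", this lists 04.07-SProject ahead of 04.07+01-SProject without renaming anything. It starts one tr process per name, so it is meant for directories of modest size.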
Using the date as part of the name makes it easy to list files/directories in chronological order. I've done this for many years.
yearmonthday_data_file, e.g. 20150424-data_file, or
20150415-4.07-SProject
20150416-4.07+01-SProject |
ls has a lot of different sorting options, like -t, -c or --sort
|
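As an illustration of the time-based options: -t sorts by modification time, newest first, and -r reverses that to oldest first. The sketch below works in a scratch directory with made-up timestamps, so nothing real is touched. Keep in mind that a directory's modification time changes whenever entries are created or deleted directly inside it, so this only reflects the order of creation while the directories are left alone.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch real projects.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/04.07-SProject" "$demo_dir/04.07+01-SProject"

# Hypothetical timestamps (CCYYMMDDhhmm): the "+01" project is one day newer.
touch -t 201504150900 "$demo_dir/04.07-SProject"
touch -t 201504160900 "$demo_dir/04.07+01-SProject"

# -t: sort by modification time (newest first); -r: reverse, i.e. oldest first.
by_time=$(ls -1tr "$demo_dir")
printf '%s\n' "$by_time"
```

Here the older 04.07-SProject is listed first, whatever the locale does with + and -.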
michaelk:
The restriction is that I do not want to change the original directory names; the directories contain large files. I have input files in many other places that refer to the original directories. If I rename one, I have to replace the directory name in the input files in many places.
[linux]$ ls -1
070-SProject
07ASProject
07-SProject
07--SProject
07SSProject
07-T02-SProjec
Now I do not understand why - should come between A and S. What is the logic? It breaks my observation in the first post. I was hoping to use -- to make it stronger than -S in sorting. |
You may have "invisible" characters inside the filenames (try ls | od -xc to check). You can also try a different locale setting to change the sorting behavior.
|
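A small sketch of the od check suggested above, using a scratch directory and a made-up pair of names where the second one ends in a space. The grep count is an extra assumption of mine, not part of the suggestion: in the C locale it flags any name containing a byte that is not a visible ASCII character, so names with accented letters would be counted as well.

```shell
#!/bin/sh
# Scratch directory with two names that look alike on screen.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/07-Sb" "$demo_dir/07-Sb "   # second name ends in a space

# od -c prints every byte of the listing, so stray spaces, tabs or
# control characters become visible.
ls -1 "$demo_dir" | od -c

# Count the names that contain anything other than visible ASCII characters.
suspicious=$(ls -1 "$demo_dir" | LC_ALL=C grep -c '[^[:graph:]]')
printf 'names with hidden bytes: %s\n' "$suspicious"
```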
No visible thing:
A-
AA
AS
B--
B-A
B-B
E-04.07ASb
E-04.07-Sb
E-04.07--Sb
E-04.07SSb
Test it yourself if you like. |
Tested on Linux Mint 14 and CentOS 6. Same thing.
|
I cannot see what you made. Please write all your commands as they were executed and the full result.
|
Even more bizarre results:
$ ls -1
07ASb
07-Sb
07--Sb
07SSb
A-
AA
aASb
AS
a-Sb
a--Sb
aSSb
B--
B-A
B-B
C--A
C-AA
C-ZA
ZASb
Z-Sb
Z--Sb
ZSSb
Now "a" appears between an "A" and another "A". |
Oops... think before typing...
My man page says it sorts alphabetically if no other sort options are specified... What does "which ls" return? (That is, does your shell have a built-in with different behavior?) Also, many distros alias ls; is there an alias that changes the expected sort behavior? |
Good question. I have used ls -1 for many years but never noticed anything bizarre.
Anyway, on the IBM machine, I have
% /usr/bin/ls -1
07--Sb
07-Sb
07ASb
07SSb
A-
AA
AS
B--
B-A
B-B
C--A
C-AA
C-ZA
Z--Sb
Z-Sb
ZASb
ZSSb
a--Sb
a-Sb
aASb
aSSb
Seems IBM got it right. |
astrogeek:
I just issue ls as is, without any customization. :) I bet that if you cut and paste my directory names into a file and run a script to make the directories, you will see what I see. So IBM AIX gives the right sort. Every Linux system I have tested (even an IBM machine with Linux installed) seems to give:
$ uname -a
Linux 2.6.32-279.el6.ppc64 #1 SMP Wed Jun 13 18:19:27 EDT 2012 ppc64 ppc64 ppc64 GNU/Linux
$ ls -1
07ASb
07-Sb
07--Sb
07SSb
A-
AA
aASb
AS
a-Sb
a--Sb
aSSb
B--
B-A
B-B
C--A
C-AA
C-ZA
ZASb
Z-Sb
Z--Sb
ZSSb |
I issued /bin/ls -1, and so far none of the Linux systems gives the expected result.
So I shall leave this to others to show me how silly I have been, or let the community fix this "bug", if there is one. I repeat the test I have done:
$ uname -a
Linux centos61 2.6.32-131.0.15.el6.x86_64 #1 SMP Sat Nov 12 15:11:58 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
$ /bin/ls -1
07ASb
07-Sb
07--Sb
07SSb
A-
AA
aASb
AS
a-Sb
a--Sb
aSSb
B--
B-A
B-B
C--A
C-AA
C-ZA
ZASb
Z-Sb
Z--Sb
ZSSb |
The answer lies in the setting of the LC_COLLATE environment variable.
With LC_COLLATE=C
Code:
bash-4.3$ LC_COLLATE=C; ls -1
Code:
bash-4.3$ LC_COLLATE=en_US.utf8; ls -1 |
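A note on the form "LC_COLLATE=C; ls -1": with the semicolon this is a plain shell assignment, and ls only sees the new value if the variable is already exported. LC_ALL, when set, also overrides LC_COLLATE. The sketch below sidesteps both points by putting LC_ALL=C directly in front of the command, which applies it to that single ls. The scratch directory and the subset of names are only for illustration.

```shell
#!/bin/sh
# Scratch directory holding a few of the test names from this thread.
demo_dir=$(mktemp -d)
( cd "$demo_dir" && mkdir 07ASb 07-Sb 07--Sb 07SSb aASb a-Sb ZASb Z-Sb )

# Assignment in front of the command, no semicolon: it is passed to this
# one ls only. LC_ALL is used because it takes priority over LC_COLLATE.
c_order=$(LC_ALL=C ls -1 "$demo_dir")
printf '%s\n' "$c_order"

# Session-wide alternative (has no effect while LC_ALL is set):
#   export LC_COLLATE=C
```

In the C locale the comparison is by byte value, so "-" sorts before the digits, the digits before uppercase, and uppercase before lowercase. These names therefore come out in the same relative order as in the /usr/bin/ls listing from the IBM machine earlier in the thread.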
On my bash, I need an export.
But now this raises a question: why does en_US.utf8 (which seems to be the default) give an order with "a" between two "A"s? Is there a locale that lists "a" first, then "A", then "b", then "B"? "export LC_COLLATE=C" puts "a" all the way behind all the capital letters, so it may be scientifically correct but is useless for an ordinary person, who prefers to see "a" and "A" files listed close to one another. For this reason I won't use LC_COLLATE=C and will instead bear with the weird quirks I described in the first post.
My theory is that some operating systems treat files named "a" and "A" as the same, and that this is what makes the sorting algorithm produce results that are hard to predict. I would be interested in an explanation of the sorting offered by en_US.utf8. One way to go about it is to write a "bash -1" command myself and then use Python to display the order I want, rather than be utterly confused by a standard that only a handful of people may understand.
Now, the results on my system: Note, this is not working. Quote:
Quote:
|
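The en_US.utf8 listings in this thread are consistent with a dictionary-style comparison: punctuation and letter case are ignored on the first pass and only used to break ties. That is an inference from the output shown here, not something confirmed in the thread, and the real collation rules are more involved. As a rough check, the sketch below builds such a first-pass key by hand and sorts on it in the C locale. dictionary_sort is a made-up helper name, names are assumed to contain no tabs or newlines, and names whose keys tie are simply left in input order (-s is supported by GNU and BSD sort but is not required by POSIX).

```shell
#!/bin/sh
# dictionary_sort: hypothetical helper approximating only the FIRST
# comparison level of a dictionary-style locale.
# Reads names on stdin, one per line; no tabs or newlines in names.
dictionary_sort() {
    tab=$(printf '\t')
    while IFS= read -r name; do
        # Key = name with punctuation removed and letters folded to lowercase.
        key=$(printf '%s' "$name" |
              LC_ALL=C tr -d '[:punct:]' |
              LC_ALL=C tr '[:upper:]' '[:lower:]')
        printf '%s\t%s\n' "$key" "$name"
    done |
    LC_ALL=C sort -s -t "$tab" -k1,1 |   # compare keys only, ties keep input order
    cut -f2-                             # drop the key again
}

# Demo: a shuffled, tie-free subset of the names from the listings above.
printf '%s\n' aSSb AS 07SSb a-Sb A- 07-Sb aASb AA 07ASb | dictionary_sort
```

For this subset the result is 07ASb, 07-Sb, 07SSb, A-, AA, aASb, AS, a-Sb, aSSb, which is the relative order the Linux listings show. On that reading, "-" does not sit between A and S at all: 07-Sb is compared as 07Sb, which falls after 07ASb and before 07SSb.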