LinuxQuestions.org
Old 04-13-2012, 10:44 PM   #1
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Blog Entries: 1

Rep: Reputation: 48
Don't use 'ls' in shell scripts... why?


Long long ago in a thread far away, I remember one of the LQ gurus recommending that 'ls' should not be used for programmatic purposes. There were several justifications given.

I've used this as a rule of thumb ever since... but I don't remember why...

I can see it in a case like this, where launching 'ls' would spawn a separate process...

Code:
for i in *.txt
do
    echo "$i"
done
I think that the argument was more along the lines that the output of 'ls' is unstable in some way.

Am I imagining this?
 
Old 04-13-2012, 11:36 PM   #2
Asido
Member
 
Registered: Jan 2010
Location: Denmark
Distribution: Gentoo, Archlinux, FreeBSD, Slackware
Posts: 84

Rep: Reputation: 24
You should never parse `ls` output if you are not the only one who is going to use the script. People make aliases, for example with the `-F` flag, which appends extra characters. Your script will simply break.
 
Old 04-14-2012, 12:24 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
See if this helps:

http://mywiki.wooledge.org/ParsingLs
 
Old 04-14-2012, 11:07 AM   #4
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443

Original Poster
Blog Entries: 1

Rep: Reputation: 48
Quote:
Originally Posted by grail
The link was exactly what I needed.

Funny thing is that I knew 90% of what was there... I just couldn't articulate it. Good to have it in one place.
 
Old 04-14-2012, 01:11 PM   #5
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Quote:
Originally Posted by Asido
You should never parse `ls` output if you are not the only one who is going to use the script. People make aliases, for example with the `-F` flag, which appends extra characters. Your script will simply break.
But aliases are disabled by default in bash scripts. From the GNU bash reference section on aliases: "Aliases are not expanded when the shell is not interactive, unless the expand_aliases shell option is set using shopt". That doesn't mean it's OK to parse ls output; it just means aliases are not a reason for avoiding it in scripts.
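
A quick way to check this, as a minimal sketch: ask a non-interactive bash whether expand_aliases is set, then define an alias and try to use it on a later line. The alias name lq_demo_ls is made up for the demonstration, and the result assumes bash is not running in POSIX mode and BASH_ENV is not set.

```shell
#!/usr/bin/env bash
# bash -c always starts a non-interactive shell, which is what a script gets.
alias_state=$(bash -c 'shopt -q expand_aliases && echo on || echo off')
echo "expand_aliases in a non-interactive bash: $alias_state"

# Define a hypothetical alias, then try to run it on the next line.
# With expand_aliases off, the name is looked up as an ordinary command and fails.
alias_result=$(bash -c '
alias lq_demo_ls="ls -F"
lq_demo_ls / >/dev/null 2>&1 && echo expanded || echo "not expanded"
')
echo "alias lq_demo_ls was: $alias_result"
```

A script that really wants aliases would have to run shopt -s expand_aliases itself, so a user's interactive alias for ls does not leak into scripts by accident.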
 
Old 04-14-2012, 02:13 PM   #6
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
My take on the matter is that it relies on the consistent behavior, over time, of another application. As soon as that behavior changes, which happens often enough, your script breaks. Added to that, it is simply wasteful when bash already has the facility to access the filesystem and its structure. Why add one or more extra steps that, at best, serve only to get in the way?

--- rod.
 
Old 04-14-2012, 03:27 PM   #7
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443

Original Poster
Blog Entries: 1

Rep: Reputation: 48
Quote:
Originally Posted by theNbomr
My take on the matter is that it relies on the consistent behavior, over time, of another application. As soon as that behavior changes, which happens often enough, your script breaks. Added to that, it is simply wasteful when bash already has the facility to access the filesystem and its structure. Why add one or more extra steps that, at best, serve only to get in the way?

--- rod.
Well... there are times when glob expansion won't do the trick in terms of finding files... sometimes you have to resort to using 'find'. I agree that the overlap between shell globs and what you can do with 'ls' is pretty large, and in those cases, you should simply use the glob. In the case of 'ls -r -d' (recursively list directory entries), the case for using bash alone isn't nearly as clear cut... that's where you reach for 'find', because 'ls' isn't up to the job.

In terms of having the behaviour of the program change... that's why there are standards. Sticking with standards compliant behaviour should ensure that you get the same thing every time.

Last edited by bartonski; 04-14-2012 at 03:31 PM.
 
Old 04-15-2012, 06:28 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
I think I understand where you are coming from but maybe next time you should pick a better example:
Code:
$ ls -r -d
.
I am pretty sure bash could handle this one
 
Old 04-15-2012, 10:39 AM   #9
ta0kira
Senior Member
 
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Rep: Reputation: Disabled
I don't think differences in versions are the main reason; if that was the case, people would recommend against sed, find, ps, tar, etc. The two main reasons I can think of are:
  1. ls doesn't expand *; the shell does, then it passes the names to ls, which echoes those that exist, one per line (when stdout isn't a tty). If you use for file in *.txt, though, you'll iterate once with the literal "*.txt" if there aren't any matching files. And if files have spaces in the names, those names will get split. So it's almost like you get some sort of "validation" by using ls *.txt | while read file.
  2. ls treats directories and files differently by default, which can ruin things, but the -d option helps.
Kevin Barry

PS In case it wasn't clear, this is one vote for "it's fine to use ls in a script." ls -l is a different story, though.
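
The unmatched-pattern behaviour from point 1 is easy to reproduce. What follows is only a sketch run in a throwaway directory; the variable names and the file name are invented for the demonstration. The third loop is there because a glob loop with a quoted variable keeps a name containing a space as a single item.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Case 1: no .txt files exist, so the pattern is passed through unexpanded.
unmatched_seen=""
for file in *.txt; do
    unmatched_seen=$file          # ends up holding the literal string *.txt
done

# Case 2: guard each iteration so a leftover pattern is skipped.
guarded_count=0
for file in *.txt; do
    [ -e "$file" ] || continue    # skips the literal *.txt
    guarded_count=$((guarded_count + 1))
done

# Case 3: a name containing a space is still one loop item.
touch "two words.txt"
space_count=0
for file in *.txt; do
    [ -e "$file" ] || continue
    space_count=$((space_count + 1))
done

echo "unmatched: $unmatched_seen, guarded: $guarded_count, with space: $space_count"
cd / && rm -rf "$demo_dir"
```

The existence test gives the same kind of "validation" without a second process; in Bash, shopt -s nullglob is another way to get there.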

Last edited by ta0kira; 04-15-2012 at 10:51 AM.
 
Old 04-15-2012, 11:17 AM   #10
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443

Original Poster
Blog Entries: 1

Rep: Reputation: 48
Quote:
Originally Posted by grail
I think I understand where you are coming from but maybe next time you should pick a better example:
Code:
$ ls -r -d
.
I am pretty sure bash could handle this one
Ok, I'm missing something... how does bash handle directory recursion natively?
 
Old 04-15-2012, 11:28 AM   #11
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443

Original Poster
Blog Entries: 1

Rep: Reputation: 48
Quote:
Originally Posted by ta0kira
PS In case it wasn't clear, this is one vote for "it's fine to use ls in a script." ls -l is a different story, though.
Hm. That's an interesting take... I'll have to think about that.
 
Old 04-15-2012, 11:52 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Quote:
Ok, I'm missing something... how does bash handle directory recursion natively?
My point is not that bash handles recursion (although you could write a recursive function), but rather that the output demonstrated by the command you supplied, the dot directory, is trivial for bash to provide.

I also agree in part with ta0kira that using ls on its own may not necessarily be dangerous, assuming that the following portion:
Quote:
And if files have spaces in the names, those names will get split.
is aimed at ls in the for loop scenario and not globbing. That being said, given the things that can go wrong with different switches, and not just -l, I personally try to steer clear of using it at all.
 
Old 04-15-2012, 04:18 PM   #13
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 115
Quote:
I don't think differences in versions are the main reason; if that was the case, people would recommend against sed, find, ps, tar, etc.
It's really quite a huge reason, though. sed, find, and tar have POSIX-mandated behavior (though tar has some oddities). ps is actually not recommended for largely the same reasons as ls: it's meant for *humans*, and completely inappropriate for scripts. pgrep is the script-safe way of going about most things you would want to use ps for.
 
Old 04-15-2012, 05:44 PM   #14
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Quote:
Originally Posted by ta0kira
this is one vote for "it's fine to use ls in a script."
Unless, of course, you have file names with newlines in them, in which case you only get partial file names if you parse the output.

Weird file names are not nearly as rare as one might think. You can create one very easily by accident, for example by typing
Code:
touch Isn't it cool?
touch Yes, it's cool.
which gives you two files, one named
Code:
Isnt it cool?
touch Yes, its
and one named
Code:
cool.
A Bash loop,
Code:
shopt -s nullglob
for FILE in * ; do
    ...
done
will handle all names correctly. So does
Code:
shopt -s nullglob
FILESDIRS=(*)
DIRSONLY=(*/)
which collects all file and directory names in the current directory (except, by default, those that start with a dot) into array FILESDIRS, and only directory names into array DIRSONLY.

The shopt -s nullglob tells Bash that non-matching glob patterns expand to nothing; the default is to expand to the pattern itself. It is enough (and a good practice) to set it once at the start of the script. (To be honest, I always forget it, and end up testing if FILE exists within the loop..)
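
To see the newline problem next to the glob approach, here is a rough sketch in a scratch directory. The file names are invented, and it assumes an ls that writes name bytes unmodified when its output is not a terminal (GNU ls does this by default).

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# One file whose name contains a newline, plus one ordinary file.
touch $'first line\nsecond line' plain.txt

# Parsing ls line by line: the embedded newline splits one name into pieces.
ls_count=0
while IFS= read -r name; do
    ls_count=$((ls_count + 1))
done < <(ls)

# Letting the shell expand the glob keeps every name in one piece.
shopt -s nullglob
entries=(*)
glob_count=${#entries[@]}

echo "lines read from ls: $ls_count, names from the glob: $glob_count"
cd / && rm -rf "$demo_dir"
```

The loop over ls sees more "names" than there are files, while the array holds exactly one element per file.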

I normally use
Code:
LANG=C LC_ALL=C
find DIR(s)... -type f -print0 | while IFS="" read -rd '' FILE ; do
    ...
done
myself, or in fact, the -printf '...%p\0' variant, which allows me to extract not only the file names (no matter what characters they might have), but also file timestamp, size, and/or access mode, at the same time, with just Bash string operations.
It is the safest and most robust way I know of. If you need to compute something and yield it to the script outside the loop, you should use
Code:
LANG=C LC_ALL=C
while IFS="" read -rd '' FILE ; do
    ...
done < <(find DIR(s)... -type f -print0)
because in the former example, the while loop is a subshell (right side of a pipe), and thus any changes it makes are never propagated to the parent, the actual shell running the script. This latter form creates or uses a (temporary) pipe to supply the input to the while loop. As the while loop is run in the original shell, not a subshell, the changes it makes to Bash variables are visible outside the loop too.
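
A small sketch of both points, assuming GNU find (for -printf) and a scratch directory with made-up file names: the piped loop loses its counter, while the process-substitution loop keeps its totals. The size is split off each "%s %p" record with plain Bash string operations.

```shell
#!/usr/bin/env bash
# Assumes GNU find; the directory and file names are invented for the demo.
demo_dir=$(mktemp -d)
printf 'hello' > "$demo_dir/a file.txt"          # 5 bytes, space in the name
printf 'hi'    > "$demo_dir"/$'odd\nname.txt'    # 2 bytes, newline in the name

# Pipeline form: the loop runs in a subshell, so the count is lost afterwards.
piped_count=0
find "$demo_dir" -type f -print0 | while IFS= read -r -d '' file; do
    piped_count=$((piped_count + 1))
done

# Process substitution form: the loop runs in the current shell.
# Each record is "<size> <path>" terminated by a NUL byte.
file_count=0
total_bytes=0
while IFS= read -r -d '' record; do
    size=${record%% *}      # text before the first space
    path=${record#* }       # everything after it, spaces and newlines included
    file_count=$((file_count + 1))
    total_bytes=$((total_bytes + size))
done < <(find "$demo_dir" -type f -printf '%s %p\0')

echo "piped_count=$piped_count file_count=$file_count total_bytes=$total_bytes"
rm -rf "$demo_dir"
```

Putting the size first in the record matters: it can never contain a space, so the first space is always the separator, whatever characters the path holds.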

While you can technically do a recursive directory walk in Bash, it is nigh impossible to do safely, because parent directory names may change mid-walk. find won't get confused by that; at worst you may see file names that are no longer there.

Note: I haven't used the Bash 4 **/ notation David the H. mentions below. It should work just as well as find does I think, as long as you set the proper Bash shell options.

Last edited by Nominal Animal; 04-16-2012 at 01:48 AM. Reason: Need 'while IFS="" read' ... to avoid a Bash bug.
 
Old 04-16-2012, 01:12 AM   #15
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Bash from version 4 up has a new globstar feature for recursive globbing.

Code:
shopt -s globstar	#it's not enabled by default.

printf '%s\n' **	#lists all files and directories recursively.

printf '%s\n' **/	#lists directories only.

printf '%s\n' **/*.txt	#lists all .txt files recursively.
So for the most part, just prefix your glob with "**/" to make it recursive. You may need to play with dotglob and the GLOBIGNORE variable if you intend to work with hidden files.

Of course you'll still need to use find for more advanced matches, such as by mtime.
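
For anyone who wants to try the recursive glob, here is a minimal sketch, assuming Bash 4 or later and a scratch directory whose layout is invented for the demo. It also shows the effect of dotglob on hidden directories.

```shell
#!/usr/bin/env bash
# Requires Bash 4+; the directory layout below is made up for the demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/docs/old" "$demo_dir/.hidden"
touch "$demo_dir/top.txt" "$demo_dir/docs/a.txt" "$demo_dir/docs/old/b c.txt" \
      "$demo_dir/.hidden/secret.txt"
cd "$demo_dir" || exit 1

shopt -s globstar nullglob

txt_files=(**/*.txt)        # hidden directories are skipped by default
visible_count=${#txt_files[@]}

shopt -s dotglob            # let globs match names that start with a dot
txt_files=(**/*.txt)
all_count=${#txt_files[@]}

printf '%s\n' "${txt_files[@]}"
echo "visible: $visible_count, with dotglob: $all_count"
cd / && rm -rf "$demo_dir"
```

Collecting the matches into an array keeps names with spaces intact, and nullglob makes the array empty rather than holding the literal pattern when nothing matches.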
 
  

