LinuxQuestions.org
Old 01-23-2006, 07:03 PM   #1
deleted/
LQ Newbie
 
Registered: Sep 2005
Location: Ireland
Distribution: FC5
Posts: 14

Rep: Reputation: 0
Sorting files in BASH


I've been trying to find an answer to this all night with no luck. Maybe I've been looking in the wrong place. Anyway, is there a way in BASH to sort files by upper- or lowercase letter?

For example, list all files beginning with an uppercase letter.

Any help would be appreciated.
 
Old 01-23-2006, 07:16 PM   #2
IBall
Senior Member
 
Registered: Nov 2003
Location: Perth, Western Australia
Distribution: Ubuntu, Debian, Various using VMWare
Posts: 2,088

Rep: Reputation: 62
See "man ls".

By default, in most non-C locales, ls sorts in alphabetical order, ignoring the case of the file name.

To list all files starting with a capital letter, you need to use a shell glob pattern (not quite a regular expression), such as (I haven't tested this): "ls [A-Z]*"

I hope this helps
--Ian
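
For reference, `ls [A-Z]*` is expanded by the shell as a glob before ls ever runs, and how the range `A-Z` is matched depends on the locale. A minimal sketch, assuming a throwaway directory created with `mktemp -d` and made-up file names, that uses the POSIX character class `[[:upper:]]` instead of a range:

```shell
# Work in a scratch directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch AAZ.TXT NEW.TXT dref.txt xret.txt

# The shell expands the glob before ls runs; [[:upper:]] matches one
# uppercase letter regardless of the locale's collation order.
# -d stops ls from descending into directories that happen to match.
upper_files=$(ls -d [[:upper:]]*)
printf '%s\n' "$upper_files"

# Clean up the scratch directory.
cd - >/dev/null || exit 1
rm -rf "$demo_dir"
```

The class form sidesteps the question of which letters a locale places between A and Z.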
 
Old 01-24-2006, 07:21 AM   #3
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
hmm, "ls [A-Z]*" does not work.
I'd think it'd be something like: ls --ignore [a-z]*
but that does not work either ..
 
Old 01-24-2006, 07:45 AM   #4
Dtsazza
Member
 
Registered: Oct 2005
Location: Oxford, UK
Distribution: Debian Etch (w/ dual-boot XP for gaming)
Posts: 282

Rep: Reputation: 31
I think you'd be better off sticking the output from ls into a pipe and letting another program sort it. There is, of course, the sort utility for sorting, but I can't see any concept of character case (from its man page at least), and you already have a case-insensitive sort. The first thing that comes to mind is grep - a simple pipe like
Code:
ls | grep ^[A-Z]
will achieve what you want. Apparently grep is pretty resource intensive, and I doubt you need its power for something like this, so there are probably better tools. sed and awk also spring to mind - though sed would basically be emulating grep as in
Code:
ls | sed -n 's/^[A-Z]/&/p'
and awk is probably even more intensive than grep.

Anyway, that's really just an aside as I don't think a standard desktop would notice the performance hit of grepping the output from ls. Though I'll admit it would be nicer to specify an argument to ls itself, so you can play around with columns (either manually or by specifying the -l argument) without having to worry about your output format. If it's important that you can do this, it shouldn't be too tough to write a script that looks at its input to work out what columns it has in what order, and sorts on the filename.
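
One caveat about the grep pipe above: left unquoted, `^[A-Z]` is also a valid shell glob, so the shell may rewrite it before grep sees it if a matching file name happens to exist. A hedged sketch that quotes the pattern and uses `[[:upper:]]`; the file list is fed in with printf purely for illustration, standing in for the output of ls:

```shell
# Sample listing standing in for the output of ls.
listing=$(printf '%s\n' AAZ.TXT dref.txt NEW.TXT xret.txt)

# Single quotes keep the shell from treating ^[...] as a glob, and
# [[:upper:]] asks for uppercase letters by class rather than by range.
upper_only=$(printf '%s\n' "$listing" | grep '^[[:upper:]]')
printf '%s\n' "$upper_only"
```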
 
Old 01-24-2006, 08:47 AM   #5
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
@Dtsazza: that's weird, your first solution gives me the same output as plain: ls
The second one works as expected though.
Have you tried that first one yourself?
 
Old 01-24-2006, 02:25 PM   #6
deleted/
LQ Newbie
 
Registered: Sep 2005
Location: Ireland
Distribution: FC5
Posts: 14

Original Poster
Rep: Reputation: 0
Thanks for the input, guys. Dtsazza, the grep pipe is perfect, cheers.

I suppose it will take a while to move from GUI-based thinking to shell-style thinking.

Thanks again.
 
Old 01-25-2006, 04:36 AM   #7
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
Can someone tell me how I can find out why this grep command is not working for me?
(just trying to learn something here ..)
Code:
$ ls
dref.txt  NEW.TXT

$ ls | grep ^[A-Z]
dref.txt
NEW.TXT

$ ls | grep ^[a-z]
dref.txt
NEW.TXT

$ ls | grep ^[0-9]
(gives nothing, so that works)
Thanks! I copied it from the command line, so it can't be typos.
 
Old 01-25-2006, 05:07 AM   #8
sohny
Member
 
Registered: Aug 2004
Location: bangalore
Distribution: Redhat,Ubuntu
Posts: 64

Rep: Reputation: 15
But those work perfectly. Those commands worked properly for me in Red Hat 9 with bash.
 
Old 01-25-2006, 05:59 AM   #9
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
Yeah, that was why I was wondering why this is not working in my shell :?
cat /etc/passwd shows I'm running /bin/bash
I'm working under SUSE 10.0
If somebody knows how I can find out why
Code:
$ ls | grep ^[A-Z]
does not work for me (see post above), please let me know
 
Old 01-25-2006, 08:15 AM   #10
Dtsazza
Member
 
Registered: Oct 2005
Location: Oxford, UK
Distribution: Debian Etch (w/ dual-boot XP for gaming)
Posts: 282

Rep: Reputation: 31
That is rather strange, and I tested the command as working before posting it (always good to check, if only for typos). Looking at your command, I can't see why it wouldn't work... it rather straightforwardly says "OK, match the start of the line, followed by one of the characters A..Z".

As a test, can you try
Code:
$ ls | grep ^N
and
Code:
$ ls | grep [A-Z]
to see if it's the caret or the range that's causing the problem (both of these should just output NEW.TXT)? Also, it might sound obvious, but is NEW.TXT all uppercase ASCII characters? If you're using a Unicode shell and the filename was generated in some way other than you typing in its name yourself, there's a chance the characters won't be in the range U+0041 to U+005A ('normal' uppercase characters) but will look the same. Yep, it's unlikely, but then that grep really should be working...
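
Whether a name really consists of plain ASCII can be checked directly. A small sketch: `is_plain_ascii` is a hypothetical helper name, and the second sample name starts with a Greek capital Nu (U+039D), which looks like a Latin N, chosen only to illustrate a look-alike character:

```shell
# Succeeds only if every byte of the argument is printable ASCII
# (space through tilde). LC_ALL=C makes grep compare raw bytes.
is_plain_ascii() {
    ! printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

latin_name='NEW.TXT'
# \316\235 is the UTF-8 encoding of U+039D, GREEK CAPITAL LETTER NU.
greek_name=$(printf '\316\235EW.TXT')

for name in "$latin_name" "$greek_name"; do
    if is_plain_ascii "$name"; then
        echo "plain ASCII"
    else
        echo "contains non-ASCII bytes"
    fi
done
```

Piping a suspicious name through `od -c` would show the individual bytes if a closer look is needed.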
 
Old 01-25-2006, 08:47 AM   #11
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
it's still weird, the minus in ^[A-Z] doesn't work properly.
Code:
$ ls -all
total 20
drwxr-xr-x  2 user users 168 2006-01-25 14:52 .
drwxr-xr-x  4 user users 248 2006-01-24 18:29 ..
-rw-r--r--  1 user users   8 2006-01-25 14:52 AAZ.TXT
-rw-r--r--  1 user users  26 2006-01-24 18:51 dref.txt
-rw-r--r--  1 user users  26 2006-01-25 14:52 dret.txt
-rw-r--r--  1 user users   8 2006-01-24 18:29 NEW.TXT
-rw-r--r--  1 user users  26 2006-01-25 14:52 xret.txt
$ ls | grep ^N
NEW.TXT
$ ls | grep [A-Z]
AAZ.TXT
dref.txt
dret.txt
NEW.TXT
xret.txt
$ ls | grep ^[A-Z]
AAZ.TXT
dref.txt
dret.txt
NEW.TXT
xret.txt
$ ls | grep ^[N]
NEW.TXT
$ ls | grep ^[NZ]
NEW.TXT
$ ls | grep ^[N-Z]
NEW.TXT
xret.txt
$ echo $SHELL
/bin/bash
$ which bash
/bin/bash
whereas with sed it does work:
Code:
$ ls | sed -n 's/^[A-Z]/&/p'
AAZ.TXT
NEW.TXT
$ ls | sed -n 's/^[a-z]/&/p'
dref.txt
dret.txt
xret.txt

Last edited by muha; 01-25-2006 at 08:49 AM.
 
Old 01-25-2006, 01:50 PM   #12
Dtsazza
Member
 
Registered: Oct 2005
Location: Oxford, UK
Distribution: Debian Etch (w/ dual-boot XP for gaming)
Posts: 282

Rep: Reputation: 31
That's really weird. My first thought was that maybe it wasn't recognising the dash as a special character - but it is more than just a literal, because of the difference between 'grep ^[NZ]' and 'grep ^[N-Z]'. It seems that it's interpreting it as a case-insensitive range.

My thoughts are that perhaps your shell's locale is defining some kind of default search order that either makes all ranges case-insensitive, or somehow inserts a-z between A and Z (less likely). But then, it's weird that sed doesn't use the same information to process its own regexes. Besides, I'm no good at locales, so moving swiftly on...

I think a
Code:
grep -V
could be interesting at this point.
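
The locale theory can be tested without changing any settings by running the same grep twice, once as-is and once with `LC_ALL=C` set for that single command. A sketch using the file names from the listing above, fed in with printf rather than ls; the result of the first run depends on the system's collation rules, so only the second one is predictable:

```shell
# File names from the listing above, fed in directly instead of via ls.
names=$(printf '%s\n' AAZ.TXT dref.txt dret.txt NEW.TXT xret.txt)

# Run 1: whatever locale the shell currently has.
current_result=$(printf '%s\n' "$names" | grep '^[A-Z]')

# Run 2: the C locale, where A-Z is exactly the byte range 0x41-0x5A.
c_result=$(printf '%s\n' "$names" | LC_ALL=C grep '^[A-Z]')

echo "current locale:"
printf '%s\n' "$current_result"
echo "C locale:"
printf '%s\n' "$c_result"
```

If the two results differ, the locale's collation order is the likely culprit.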
 
Old 01-25-2006, 02:29 PM   #13
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
Thanks for thinking along. I'm still just trying to make sense of Linux, so ..
I have no real problem atm but am just trying to learn.

Code:
$ grep -V
grep (GNU grep) 2.5.1

Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
According to 'man setlocale', LC_COLLATE="en_US.UTF-8" and LC_CTYPE="en_US.UTF-8"
are the ones defining regexp behaviour. But they look pretty standard to me. Maybe LC_ALL= should also be defined :?

/edit: ahhh! It is indeed LC_ALL which should be set to LC_ALL=C
as explained here: http://www.linuxquestions.org/questi...hreadid=327916

/edit: to change this setting to LC_ALL=C do
Code:
$ export LC_ALL='C'

$ ls -all
total 20
drwxr-xr-x  3 user users 200 Jan 25 21:21 .
drwxr-xr-x  4 user users 248 Jan 24 18:29 ..
-rw-r--r--  1 user users   8 Jan 25 20:59 AAZ.TXT
-rw-r--r--  1 user users   8 Jan 25 20:59 NEW.TXT
-rw-r--r--  1 user users  26 Jan 25 20:59 dref.txt
-rw-r--r--  1 user users  26 Jan 25 20:59 dret.txt
drwxr-xr-x  2 user users 176 Jan 25 20:59 new
-rw-r--r--  1 user users  26 Jan 25 20:59 x_aet.txt
$ ls |grep ^[A-Z]
AAZ.TXT
NEW.TXT
Thanks to Dtsazza for the hints! Funny to see that the behaviour of ls -all has changed according to the LC_ALL setting.
And also that SUSE 10.0 does not set LC_ALL to C by default for me.

Last edited by muha; 01-25-2006 at 03:03 PM.
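
Since `export LC_ALL='C'` affects every later command in the session (the date format in the listing above changed as well), a narrower alternative is to set the variable for one command only. A sketch in a scratch directory created with `mktemp -d`, with made-up file names:

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/AAZ.TXT" "$demo_dir/NEW.TXT" "$demo_dir/dref.txt" "$demo_dir/xret.txt"

# Remember the session value so we can show it is left alone.
before=${LC_ALL-unset}

# The assignment prefix applies to this one ls invocation only.
# In the C locale names sort by byte value, so uppercase comes first.
c_sorted=$(LC_ALL=C ls "$demo_dir")
printf '%s\n' "$c_sorted"

after=${LC_ALL-unset}
rm -rf "$demo_dir"
```

The same prefix works for grep, so the rest of the session keeps its usual locale.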
 
Old 01-25-2006, 04:36 PM   #14
Dtsazza
Member
 
Registered: Oct 2005
Location: Oxford, UK
Distribution: Debian Etch (w/ dual-boot XP for gaming)
Posts: 282

Rep: Reputation: 31
Wow... I didn't know that myself, so thanks for taking my vague hints and making them into something workable! I've got the exact same version of grep so it really must be something else. And my LC_ALL variable is also unset, so it wasn't that.

Still, if we've got something that sorts in 'ls' itself, that's much better than piping to grep, both from a resources point of view and so you can play around with ls switches and still get what you want.
 
Old 01-26-2006, 03:35 AM   #15
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
Besides LC_ALL, other 'locale' settings can be responsible as well. From man grep:
Quote:
A locale LC_foo is specified by examining the three environment variables LC_ALL, LC_foo, LANG, in that order. The first of these variables that is set specifies the locale. For example, if LC_ALL is not set, but LC_MESSAGES is set to pt_BR, then Brazilian Portuguese is used for the LC_MESSAGES locale. The C locale is used if none of these environment variables are set, or if the locale catalog is not installed, or if grep was not compiled with national language support (NLS).
Pretty complex with all the or, or, ors
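
The lookup order quoted above can be written out as a small function. This is only a sketch: `effective_locale` is a hypothetical helper name, and it treats an empty variable the same as an unset one, which is an assumption based on how POSIX describes these variables rather than on the quoted man page text:

```shell
# Print the locale that applies to one category, e.g. LC_COLLATE.
# Empty values are treated like unset ones (an assumption, see above).
effective_locale() {
    category=$1
    # Indirect lookup of the variable named by $category; only pass
    # trusted category names, since eval executes its argument.
    eval "category_value=\${$category-}"
    if [ -n "${LC_ALL-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C
    fi
}

# Show which locale governs collation in the current environment.
effective_locale LC_COLLATE
```

With the settings shown earlier in the thread (LC_ALL empty, LANG=en_US.UTF-8), the function would fall through to the category variable or LANG.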
 
  

