LinuxQuestions.org
Old 12-05-2006, 09:16 PM   #1
ahz
Member
 
Registered: Oct 2004
Posts: 58

Rep: Reputation: 15
Natural sort with just Bash or core Linux commands?


How do you sort a directory in the natural sort order just using Bash and core Linux utilities (such as sort and ls)? (In other words, no C, C++, PHP, Perl, etc.)

In natural sort, the file names img0.jpg img1.jpg img10.jpg should appear
img0.jpg
img1.jpg
img10.jpg

Instead of the standard sort or even dictionary sort (not helpful):
img0.jpg
img10.jpg
img1.jpg
 
Old 12-05-2006, 09:46 PM   #2
fordeck
Member
 
Registered: Oct 2006
Location: Utah
Posts: 520

Rep: Reputation: 61
Here is an example:

Quote:
$ ls *txt
sort.txt
$
$
$ cat sort.txt
img0.jpg
img10.jpg
img1.jpg
$
$
$ sort -k1.4,1.5 sort.txt
img0.jpg
img1.jpg
img10.jpg
$
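A note on why this works, with a hedged variant. The key -k1.4,1.5 compares only characters 4 and 5 of each line ("0.", "10", "1."), which happens to order this sample correctly in the C locale because "." sorts before the digits. A numeric key starting at the same offset does not depend on how many digits the number has. The helper name sort_after_prefix and the fixed prefix length are assumptions for illustration, not part of the original post.

```shell
#!/bin/sh
# sort_after_prefix N: hypothetical helper. Sorts stdin numerically on the
# text that starts right after an N-character prefix (e.g. N=3 for "img").
sort_after_prefix() {
    # -k 1.<N+1>n : key begins at character N+1 of field 1, compared as a number
    LC_ALL=C sort -k "1.$(($1 + 1))n"
}

printf 'img0.jpg\nimg10.jpg\nimg1.jpg\nimg123.jpg\n' | sort_after_prefix 3
# prints img0.jpg, img1.jpg, img10.jpg, img123.jpg (one per line)
```

This only helps when every name shares the same prefix length.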
Let me know if that works for you.

Regards,
Fordeck

Last edited by fordeck; 12-05-2006 at 09:47 PM.
 
Old 12-05-2006, 09:49 PM   #3
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
Rename your files to have leading 0's in the name so that dictionary sort order IS natural sort order.</cheap shot>

Seriously though, I'd like to know this too.
 
Old 12-05-2006, 09:51 PM   #4
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
fordeck: that's nice, but what if your list is like this:
Code:
file0.dat
file10.dat
file1.dat
bananas0.dat
bananas10.dat
bananas1.dat
...and you want to do it in one go?
 
Old 12-05-2006, 10:18 PM   #5
ahz
Member
 
Registered: Oct 2004
Posts: 58

Original Poster
Rep: Reputation: 15
Matthew is correct. The program should automatically be able to handle any kind of file such as 123.jpg, Picture123.jpg, img123.jpg, etc. These algorithms exist in C, but as I wrote, I need Bash.

My main purpose in asking is to make it easier to create DVD slide shows from OpenOffice.org Impress (and PowerPoint) using a nice, existing program written in Bash.
https://sourceforge.net/tracker/?fun...roup_id=100188
http://www.oooforum.org/forum/viewtopic.phtml?t=45483
 
Old 12-05-2006, 10:31 PM   #6
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
After googling about a bit I can't find a Free Software tool for this. If you have a C program already written, consider releasing it under the GPL.

Seems like a massive omission though. I'm surprised there's no option in the GNU sort program.
 
Old 12-05-2006, 11:10 PM   #7
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 115
If you've got C, shouldn't it be possible (if painful) to port that to Bash? If you don't want to do the work yourself for whatever reason, then please release the source so that maybe somebody else here can work on it.

I suppose it was omitted because it never came up as an issue. I think most would take the <cheap shot/> approach, or didn't care what order things were in. There's a secondary advantage to <cheap shot/>, it is a lot prettier to ls because everything is nicely aligned.
 
Old 12-05-2006, 11:18 PM   #8
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
Quote:
Originally Posted by tuxdev
If you've got C, shouldn't it be possible (if painful) to port that to Bash? If you don't want to do the work yourself for whatever reason, then please release the source so that maybe somebody else here can work on it.
There's a Perl module on CPAN, so as long as we're happy making a program to slurp up all the input into memory and sort it that way it's a trivial matter to use the module. Maybe the utility could be called "natsort". Making an efficient program to sort huge files which don't fit in memory is another matter. One would hope that adding the feature to GNU sort would just be a matter of having a comparison function, and adding the command line option.

Quote:
Originally Posted by tuxdev
I suppose it was omitted because it never came up as an issue. I think most would take the <cheap shot/> approach, or didn't care what order things were in. There's a secondary advantage to <cheap shot/>, it is a lot prettier to ls because everything is nicely aligned.
Well, sort of. After the first time I came across this sort of thing, I made sure all my files have zero-padded numbers in their names, but the person doing the sorting doesn't always have control over the names of the files / input data.
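
For anyone who does control the names, here is a rough Bash sketch of the zero-padding approach. The function name pad_name and the default width of 4 are made up for illustration, and it pads only the first run of digits in a name.

```shell
#!/bin/bash
# pad_name NAME [WIDTH]: hypothetical helper. Prints NAME with its first
# run of digits zero-padded to WIDTH (default 4). Names without digits are
# printed unchanged.
pad_name() {
    local name=$1 width=${2:-4}
    if [[ $name =~ ^([^0-9]*)([0-9]+)(.*)$ ]]; then
        # 10# forces base 10 so that "08" is not rejected as bad octal
        printf '%s%0*d%s\n' "${BASH_REMATCH[1]}" "$width" \
            "$((10#${BASH_REMATCH[2]}))" "${BASH_REMATCH[3]}"
    else
        printf '%s\n' "$name"
    fi
}

pad_name img1.jpg 3     # prints img001.jpg
pad_name img10.jpg 3    # prints img010.jpg
pad_name notes.txt      # prints notes.txt
```

To try it on real files, a dry run such as for f in *.jpg; do echo mv -- "$f" "$(pad_name "$f" 3)"; done shows what would be renamed before anything is touched.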
 
Old 12-06-2006, 12:16 AM   #9
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 115
Quote:
There's a Perl module on CPAN, so as long as we're happy making a program to slurp up all the input into memory and sort it that way it's a trivial matter to use the module. Maybe the utility could be called "natsort". Making an efficient program to sort huge files which don't fit in memory is another matter. One would hope that adding the feature to GNU sort would just be a matter of having a comparison function, and adding the command line option.
With scripting languages, just about every computer performance metric flies out the window. But apparently, using a C program is out of the question, and if/when such an extension is added to GNU sort, it will take time to reach ubiquity. So, we're stuck with scripting. But if there aren't more than, say, 100-300 files, it should be okay.

Hey, this sounds like something cool to do in Lisp. I've been meaning to do something more than "Hello World", and it doesn't look like this has been done before.
 
Old 12-06-2006, 05:16 AM   #10
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,370

Rep: Reputation: 387
Hi

PHP has this function, so if you have PHP installed, you can use this script:

Code:
#!/usr/bin/php
<?php

if ($argc == 2)
        $in_file = $argv[1];
else
        $in_file = "php://stdin";

$fp = fopen($in_file,"r")
or die("Failed opening $in_file");
$data = array();
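// Read to EOF; the strict comparison keeps a final line containing only "0" from ending the loop early
while (($line = fgets($fp)) !== false)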
        $data[] = $line;
fclose($fp);
natsort($data);
foreach ($data as $line)
        echo $line;
?>
You might need to set the path to php in the first line.

If you set execute rights on the script, it will run like every other script, even if it's PHP.

When running the script, you can pass a filename as a parameter; if you don't specify one, it will read from stdin.
 
Old 12-06-2006, 02:53 PM   #11
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 732

Rep: Reputation: 75
Hi.

Given the data file data2:
Code:
file0.dat
file10.dat
file1.dat
bananas0.dat
bananas10.dat
bananas1.dat
img0.jpg
img10.jpg
img1.jpg
Picture123.jpg
img123.jpg
Picture06.jpg
Picture006.jpg
Picture6.jpg
Picture6.gif
Picture8600.jpg
Operated on by script s1:
Code:
#!/bin/sh

# @(#) s1       Demonstrate key extraction for embedded numeric string.
# $Id$

F=${1-data2}
sed -e 's/^\([a-zA-Z]*\)\([0-9]*\)\(.*\)/\1\3 \2 \1\2\3/' "$F" |
sort -k 1,1 -k 2,2n |
sed -e 's/^.* //'
will produce:
Code:
% ./s1
Picture6.gif
Picture006.jpg
Picture06.jpg
Picture6.jpg
Picture123.jpg
Picture8600.jpg
bananas0.dat
bananas1.dat
bananas10.dat
file0.dat
file1.dat
file10.dat
img0.jpg
img1.jpg
img10.jpg
img123.jpg
The alpha and numeric strings are extracted and placed ahead of the entire filename. Then fields 1 and 2 are sorted, the latter as a numeric field. Finally the extracted key fields are discarded.
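
To make the key extraction concrete, this sketch runs only the first sed stage of s1 so the decorated lines are visible. The function name decorate is mine, not part of the script.

```shell
#!/bin/sh
# decorate: the first sed stage of s1 on its own, reading stdin.
# Output format per line: <alpha+rest key> <numeric key> <original name>
decorate() {
    sed -e 's/^\([a-zA-Z]*\)\([0-9]*\)\(.*\)/\1\3 \2 \1\2\3/'
}

printf 'img10.jpg\nPicture6.gif\n' | decorate
# prints:
#   img.jpg 10 img10.jpg
#   Picture.gif 6 Picture6.gif
```

sort then orders on the first field as text and the second as a number, and the last sed strips everything up to the final space.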

Not bullet-proof, but good enough for government work ... cheers, makyo

( edit 1: typo; missing single-quote )

Last edited by makyo; 12-06-2006 at 10:27 PM.
 
Old 12-06-2006, 03:07 PM   #12
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
makyo, that's really neat. I have to spend some time to understand it.

Having said that, it's not a general solution to the natural sort problem. It fails to work correctly when the numeric portion precedes the non-numeric part.

For example, this list won't be sorted correctly.
Code:
02Al002
001Al201
001Al3
3Al001
30Al001
One would be able to properly sort this data with a modification to the script, but it would not be possible to properly sort this data mixed with data which is alpha then numeric, or even more complex alpha, num, alpha, num etc.
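
One possible way to cover mixed layouts with only sed, sort and cut is to build a key in which every run of digits is zero-padded to a fixed width, so that plain byte-wise comparison orders the numbers by value. This is a sketch under assumptions, not a tested general solution: it assumes GNU sed (for -E and \t), numbers of at most 10 digits, and no tab characters in the names. The function name natsort_padded is invented here.

```shell
#!/bin/sh
# natsort_padded: hypothetical helper. Reads names on stdin, prints them in
# natural order. Assumes GNU sed, digit runs of at most 10 digits, no tabs.
natsort_padded() {
    tab=$(printf '\t')
    # h       : save the original line in the hold space
    # 1st s   : prefix every digit run with ten zeros
    # 2nd s   : trim each padded run back to its last ten digits
    # G, 3rd s: append the original line, separated from the key by a tab
    sed -E 'h; s/[0-9]+/0000000000&/g; s/0*([0-9]{10})/\1/g; G; s/\n/\t/' |
        LC_ALL=C sort -t "$tab" -k1,1 |
        cut -f2-
}

printf '02Al002\n001Al201\n001Al3\n3Al001\n30Al001\n' | natsort_padded
# prints 001Al3, 001Al201, 02Al002, 3Al001, 30Al001 (one per line)
```

Names whose keys are identical, such as Picture6.jpg and Picture06.jpg, fall back to sort's whole-line comparison, so the result stays deterministic.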
 
Old 12-06-2006, 05:02 PM   #13
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
I just found that there is a patch for the GNU sort program:

http://sourcefrog.net/projects/natsort/textutils.diff

This adds natural sorting to GNU sort. There is also a stand-alone natural sort utility, called natsort, written by the guy who made the patch.
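
For readers on newer systems: GNU coreutils releases later than this thread include a version-sort mode, sort -V (GNU ls has a matching -v option), which orders embedded numbers by value. It is not necessarily the same algorithm as the patch above, and its handling of leading zeros and suffixes may differ from other natural-sort implementations, so treat this as a sketch that assumes a GNU sort with -V available. The wrapper name natsort_v is invented for the example.

```shell
#!/bin/sh
# natsort_v: hypothetical thin wrapper around GNU sort's version sort.
# Assumes a GNU coreutils sort that supports -V.
natsort_v() {
    LC_ALL=C sort -V
}

printf 'img0.jpg\nimg10.jpg\nimg1.jpg\n' | natsort_v
# prints img0.jpg, img1.jpg, img10.jpg (one per line)
```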
 
Old 12-06-2006, 08:26 PM   #14
burninGpi
Member
 
Registered: Mar 2006
Location: Fort McMurray, Canada
Distribution: Gentoo ~amd64
Posts: 163

Rep: Reputation: 30
here's a REALLY cheap way to do this:
Code:
#!/bin/bash
ls >/tmp/files
cat << EOF >/tmp/natsort.c
*** insert c code for natural sorting here ***
EOF
gcc /tmp/natsort.c -o /tmp/natsort
/tmp/natsort /tmp/files
rm -f /tmp/files /tmp/natsort.c /tmp/natsort
 
Old 12-06-2006, 09:20 PM   #15
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3594
Quote:
Originally Posted by burninGpi
here's a REALLY cheap way to do this
Actually it's rather expensive because it won't work: you'll be missing the header file.
 
  


All times are GMT -5.