LinuxQuestions.org
Old 04-10-2014, 01:06 PM   #1
thiyagusham
Member
 
Registered: Apr 2012
Posts: 213

Rep: Reputation: Disabled
sort command in linux


Hi,

I have a doubt.
How does sort work? I don't understand the logic it uses for numeric sorting.

Code:
$ cat file5
15
23
125
225
456
456
1025
What logic is applied here?

Code:
$ sort file5
1025
125
15
225
23
456
456
As far as I can tell, it sorts based on ASCII, but I don't understand the logic.
 
Old 04-10-2014, 01:08 PM   #2
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: CentOS 6 (pre-systemd)
Posts: 2,749

Rep: Reputation: 741
The first characters (columns) are in order. Where the first characters are the same, the second characters are in order, and so on.
 
1 members found this post helpful.
Old 04-10-2014, 01:17 PM   #3
jdkaye
LQ Guru
 
Registered: Dec 2008
Location: Westgate-on-Sea, Kent, UK
Distribution: Debian Testing Amd64
Posts: 5,464

Rep: Reputation: Disabled
Another way of looking at it is that the sort command treats strings of digits like strings of letters. It sorts them not according to their value but according to their position in the "alphabet" (0-9 instead of a-z), in the way that smallpond described. So 1025 comes before 125 for the same reason that bacf comes before bcf in normal alphabetization.
Hope that's clear.
jdk

Last edited by jdkaye; 04-10-2014 at 01:19 PM.
 
2 members found this post helpful.
Old 04-10-2014, 01:23 PM   #4
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,465

Rep: Reputation: 2069
In addition to the above, if you want to sort them numerically use the "-n" switch.
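
As a minimal sketch of the difference, assuming file5 holds the seven numbers from the first post (the file is recreated here so the example is self-contained, and LC_ALL=C is an added assumption that makes sort compare raw bytes):

```shell
# Recreate file5 from the first post (assumed contents).
printf '%s\n' 15 23 125 225 456 456 1025 > file5

# Default: lines are compared as text, character by character.
lexical=$(LC_ALL=C sort file5)

# -n: lines are compared by their numeric value.
numeric=$(sort -n file5)

printf '%s\n' "$lexical"
echo "---"
printf '%s\n' "$numeric"
```

The first listing should match the output in the first post; the second should be in ascending numeric order.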
 
1 members found this post helpful.
Old 04-10-2014, 01:24 PM   #5
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,114

Rep: Reputation: 874
It's in lexicographical order.

Check man sort.
If you want it in numerical order, use the -n option.
 
1 members found this post helpful.
Old 04-10-2014, 01:32 PM   #6
thiyagusham
Member
 
Registered: Apr 2012
Posts: 213

Original Poster
Rep: Reputation: Disabled
Hi,

Thanks jdkaye.

Could you please elaborate with a better example?

I am NOT clear; I still have some confusion.
 
Old 04-10-2014, 01:37 PM   #7
thiyagusham
Member
 
Registered: Apr 2012
Posts: 213

Original Poster
Rep: Reputation: Disabled
Hi All,

I clearly asked what the logic behind my output is.
I am NOT asking about the -n option. Please answer my EXACT question.
 
Old 04-10-2014, 01:39 PM   #8
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,465

Rep: Reputation: 2069
Why are you getting so fussy?

As you've been told three times already, it sorts it alphabetically. 1 comes before 2, 2 comes before 3, etc. It does this on individual characters, NOT on the entire number. If two lines start with the same character, it moves to the second character, then the third, same way you alphabetize any list.

Start by just looking at the first character
Code:
$ sort file5 
1
1
1
2
2
4
4
OK, there are three 1s, so move to the next character, and what do you get?
Code:
0
2
5
How about the 2s?
Code:
2
3
And the 456s are the same, so their "order" doesn't matter.
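
The walk-through above can be reproduced with standard tools. This is only a sketch: it assumes file5 holds the numbers from the first post, and the use of cut and grep to isolate single characters is an illustration, not what sort does internally.

```shell
# Recreate file5 from the first post (assumed contents).
printf '%s\n' 15 23 125 225 456 456 1025 > file5

# Step 1: the first character of every line, in sorted order.
first=$(LC_ALL=C sort file5 | cut -c1)

# Step 2: among the lines starting with 1, the second character decides.
second_of_1=$(grep '^1' file5 | cut -c2 | LC_ALL=C sort)

# Step 3: the same for the lines starting with 2.
second_of_2=$(grep '^2' file5 | cut -c2 | LC_ALL=C sort)

echo "first characters:      " $first
echo "second chars after 1:  " $second_of_1
echo "second chars after 2:  " $second_of_2
```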

Last edited by suicidaleggroll; 04-10-2014 at 01:45 PM.
 
1 members found this post helpful.
Old 04-10-2014, 03:15 PM   #9
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,811

Rep: Reputation: 4190
Quote:
Originally Posted by thiyagusham
Hi All,
I clearly asked what the logic behind my output is. I am NOT asking about the -n option. Please answer my EXACT question.
Please READ AND UNDERSTAND the answers you have been given. smallpond, suicidaleggroll, and jdkaye have ALL told you exactly why you're getting that output. You were also pointed to the man page, and as far as I know, there is NOTHING keeping you from going to Google and doing further research, or even downloading the source code for sort and looking at it.

If you don't like the answers you get here, you can always ask elsewhere...perhaps they can fit your 'exact' needs. If you don't UNDERSTAND the answers you get here, being snotty sure won't get you much further.
 
Old 04-11-2014, 04:34 AM   #10
TheIndependentAquarius
Senior Member
 
Registered: Dec 2008
Posts: 4,679
Blog Entries: 29

Rep: Reputation: 917
Here is a simplification of the excellent explanation of jdkaye.
Code:
$ sort file5
1025
125
15
225
23
456
456
You are thinking that 1025 is greater than 125 because you are
reading 1025 as one thousand twenty five, and 125 as one hundred
twenty five.

The point here is that the sort command doesn't read those numbers
the way you are reading them.

The sort command compares these numbers digit by digit.

First it compares the 1 from 1025 with the 1 from 125 and finds that both are
equal,
then it compares the 0 from 1025 with the 2 from 125 and finds that 0 is
smaller than 2, hence it places 1025 ahead of 125.
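
To make the digit-by-digit comparison concrete, here is a small sketch. The function first_diff is a hypothetical helper written for this post, not part of sort; it only reports where two strings first differ.

```shell
# Hypothetical helper: print the first position at which two strings
# differ, followed by the character each string has at that position.
first_diff() {
    a=$1
    b=$2
    i=1
    while :; do
        ca=$(printf '%s' "$a" | cut -c"$i")
        cb=$(printf '%s' "$b" | cut -c"$i")
        # Stop at the first mismatch, or when both strings are exhausted.
        if [ "$ca" != "$cb" ] || [ -z "$ca" ]; then
            echo "$i $ca $cb"
            return
        fi
        i=$((i + 1))
    done
}

first_diff 1025 125    # prints "2 0 2"
```

Here the first difference is at position 2, where 0 sorts before 2, so 1025 is placed ahead of 125.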

Last edited by TheIndependentAquarius; 04-11-2014 at 04:36 AM.
 
1 members found this post helpful.
Old 04-11-2014, 04:57 AM   #11
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 9,220

Rep: Reputation: 2698
Just try replacing 0->a, 1->b, 2->c and so on, sort those strings, and then replace back a->0, b->1, ...
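
A minimal sketch of that experiment using tr, assuming file5 holds the numbers from the first post:

```shell
# Recreate file5 from the first post (assumed contents).
printf '%s\n' 15 23 125 225 456 456 1025 > file5

# Map digits to letters (0->a, 1->b, ..., 9->j) and sort the result.
as_letters=$(tr '0-9' 'a-j' < file5 | LC_ALL=C sort)
printf '%s\n' "$as_letters"

# Map the sorted letters back to digits.
back=$(printf '%s\n' "$as_letters" | tr 'a-j' '0-9')
printf '%s\n' "$back"
```

The letter version comes out in ordinary alphabetical order (bacf, bcf, bf, ...), and mapping it back gives the same order that sort file5 produced.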
 
Old 04-11-2014, 05:29 AM   #12
jdkaye
LQ Guru
 
Registered: Dec 2008
Location: Westgate-on-Sea, Kent, UK
Distribution: Debian Testing Amd64
Posts: 5,464

Rep: Reputation: Disabled
Quote:
Originally Posted by pan64
Just try replacing 0->a, 1->b, 2->c and so on, sort those strings, and then replace back a->0, b->1, ...
That's exactly what I said in my response and the OP could only say
Quote:
Could you please elaborate with a better example?

I am NOT clear; I still have some confusion.
So much for our good intentions.
jdk
 
Old 04-11-2014, 06:26 AM   #13
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Distribution: UBUNTU 5.10 since Jul-18,2006 on Intel 820 DC
Posts: 832

Rep: Reputation: 200
It used to be called "sort-merge". That should explain the internal logic to you. This can be another thread. Briefly, the `sort` command assumes line-by-line, left-to-right, character-based sorting unless you give options. The `man` page says it simply: "SORT LINES OF TEXT FILES".

You really should READ THE MANUAL.

SO AS OTHERS HAVE SAID REPEATEDLY, UNLESS YOU USE OPTIONS, SORT ASSUMES THAT THE LINES TO BE SORTED ARE CHARACTERS. UNLIKE WITH NUMBERS, THE ASCII VALUE OF EACH CHARACTER USUALLY DETERMINES THE SEQUENCE.

You can use options to change the column delimiter, column number and nature of data to be sorted.
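
As a rough sketch of those options, with made-up data: the file fruit.txt and its name:count layout are hypothetical, while -t, -k and the n key modifier are standard sort options.

```shell
# Hypothetical colon-separated data in the form name:count.
printf '%s\n' 'pear:1025' 'apple:125' 'fig:15' > fruit.txt

# -t sets the field delimiter, -k2,2 restricts the sort key to field 2,
# and the n modifier compares that field numerically.
by_count=$(sort -t: -k2,2n fruit.txt)

# The same key compared as text instead of as a number.
by_text=$(LC_ALL=C sort -t: -k2,2 fruit.txt)

printf '%s\n' "$by_count"
echo "---"
printf '%s\n' "$by_text"
```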

OK

Last edited by AnanthaP; 04-11-2014 at 06:29 AM.
 
Old 04-11-2014, 06:39 AM   #14
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,688

Rep: Reputation: 1259
People, he may be having trouble with English.

Don't think of the file's contents as "numbers"; they aren't.

Your sequence
Code:
1025
125

is actually the following sequence of bytes:

Code:
0000000   1   0   2   5  \n   1   2   5  \n
        061 060 062 065 012 061 062 065 012
Here the second row gives the octal value of each byte of the input.

Sort compares those byte values. Because 060 (the second byte of 1025) is less than 062 (the second byte of 125), 1025 is placed first.

Using the -n option makes sort compare the lines by their numeric value instead.
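
The dump above looks like od output. A minimal sketch that reproduces it and checks the two byte values, assuming a POSIX shell with od available:

```shell
# Dump the two lines as characters (-c) and as octal byte values (-b).
printf '1025\n125\n' | od -c -b

# The second bytes of the two lines are '0' and '2'.
# printf with a leading quote prints a character's numeric (decimal) code.
zero=$(printf '%d' "'0")
two=$(printf '%d' "'2")
echo "code of 0: $zero, code of 2: $two"
```

Octal 060 is decimal 48 and octal 062 is decimal 50, so the byte for 0 compares lower than the byte for 2.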
 
1 members found this post helpful.
  



All times are GMT -5.
