LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 01-28-2010, 11:55 PM   #1
anu_1
LQ Newbie
 
Registered: Jan 2010
Posts: 12

Rep: Reputation: 0
sort command help


A file contains two entries:
Windows NT
Windows2008

On AIX, "sort filename" produces:
Windows NT
Windows2008

On Linux, the same command on the same file produces:
Windows2008
Windows NT

Could anyone please explain? Is this because sort treats the space differently on AIX and Linux?

Thanks for the help.
 
Old 01-29-2010, 02:50 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

sort uses the locale specified in the environment (for example the LC_ALL=xxx setting), which is probably why the output differs.

Not all versions of sort support it, but you can try the -A option of AIX sort. You could also set LC_ALL to C (LC_ALL=C), but that may influence more than just sort! Be careful if this is a production environment.
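
A minimal sketch of the per-command form, which keeps the change away from everything else in the session. The two sample lines come from the question; feeding them on standard input instead of naming a file is just a convenience for this example.

```shell
# The VAR=value prefix applies to this one sort process only;
# the shell itself and later commands keep their current locale.
sorted_c=$(printf 'Windows2008\nWindows NT\n' | LC_ALL=C sort)

# In the C locale the comparison is byte by byte, so the line with
# the space (byte 32) comes before the line with "2" (byte 50).
printf '%s\n' "$sorted_c"
```

Exporting LC_ALL=C in a profile or startup script is the wider change that the warning about production environments applies to.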

Hope this clears things up a bit.
 
Old 01-29-2010, 02:54 AM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
I don't know about Windows, but the sorting order in Unix depends on the locale. The sort order of Unicode locales in particular is different from the C/POSIX order. If you set your LC_COLLATE environment variable to either C or POSIX, the two entries above sort the same way on both systems.

Edit: Aargh, beaten by Druuna. But I can at least point out that setting only LC_COLLATE is more specific than setting LC_ALL: it changes collation alone and leaves the other locale categories untouched.
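
A short sketch of the LC_COLLATE variant, with one caveat worth labelling: a non-empty LC_ALL takes precedence over LC_COLLATE, so the example blanks LC_ALL for the one command in case it is already set in the environment. The sample lines are the ones from the question.

```shell
# LC_ALL= (empty) is treated as unset, so LC_COLLATE=C decides the
# collation order while the other locale categories stay as they were.
by_collate=$(printf 'Windows2008\nWindows NT\n' | LC_ALL= LC_COLLATE=C sort)

# Byte-order collation: "Windows NT" is printed before "Windows2008".
printf '%s\n' "$by_collate"
```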

Last edited by David the H.; 01-29-2010 at 02:59 AM.
 
Old 01-29-2010, 03:18 AM   #4
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I'm not sure of the answer, because on my system a space on its own sorts ahead of "2", yet "Windows2008" sorts before "Windows NT".
I tried using -t' ' to change the field separator, but it made no difference; the order was always the same.
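
For reference, the C/POSIX locale compares bytes, and there a space does come before "2". A quick check of the two character codes, assuming an ASCII-compatible character set:

```shell
# With a leading quote in the argument, printf's %d prints the numeric
# code of the character that follows the quote.
space_code=$(printf '%d' "' ")
two_code=$(printf '%d' "'2")

printf 'space=%s two=%s\n' "$space_code" "$two_code"   # space=32 two=50
```

Locale-aware collation works differently: it can give spaces and punctuation little or no weight on the first pass, which would be consistent with "Windows2008" landing ahead of "Windows NT" outside the C locale.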

I took a peek at sort.c:
Code:
#ifdef POSIX_UNSPECIFIED
  /* The following block of code makes GNU sort incompatible with
     standard Unix sort, so it's ifdef'd out for now.
     The POSIX spec isn't clear on how to interpret this.
     FIXME: request clarification.

     From: kwzh@gnu.ai.mit.edu (Karl Heuer)
     Date: Thu, 30 May 96 12:20:41 -0400
     [Translated to POSIX 1003.1-2001 terminology by Paul Eggert.]

     [...]I believe I've found another bug in `sort'.

     $ cat /tmp/sort.in
     a b c 2 d
     pq rs 1 t
     $ textutils-1.15/src/sort -k1.7,1.7 </tmp/sort.in
     a b c 2 d
     pq rs 1 t
     $ /bin/sort -k1.7,1.7 </tmp/sort.in
     pq rs 1 t
     a b c 2 d

     Unix sort produced the answer I expected: sort on the single character
     in column 7.  GNU sort produced different results, because it disagrees
     on the interpretation of the key-end spec "M.N".  Unix sort reads this
     as "skip M-1 fields, then N-1 characters"; but GNU sort wants it to mean
     "skip M-1 fields, then either N-1 characters or the rest of the current
     field, whichever comes first".  This extra clause applies only to
     key-ends, not key-starts.
     */

  /* Make LIM point to the end of (one byte past) the current field.  */
  if (tab != NULL)
    {
      char *newlim;
      newlim = memchr (ptr, tab, lim - ptr);
      if (newlim)
        lim = newlim;
    }
  else
    {
      char *newlim;
      newlim = ptr;
      while (newlim < lim && blanks[to_uchar (*newlim)])
        ++newlim;
      while (newlim < lim && !blanks[to_uchar (*newlim)])
        ++newlim;
      lim = newlim;
    }
#endif
Actually, in this case, the original GNU interpretation is what I expected. A decimal point implies that what follows is part of a single field, not something that may span several fields.
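
The input from the quoted comment can be fed to whichever sort is installed to see which interpretation it follows. This is only a sketch: LC_ALL=C is added here to keep locale effects out of the comparison, and no particular result is promised for every sort implementation. According to the comment, traditional Unix sort prints the "pq rs 1 t" line first.

```shell
# Same two lines as /tmp/sort.in in the quoted comment, passed on stdin.
key_sorted=$(printf 'a b c 2 d\npq rs 1 t\n' | LC_ALL=C sort -k1.7,1.7)

# If the key is read as the single character in column 7 ("2" versus "1"),
# the "pq rs 1 t" line is printed first.
printf '%s\n' "$key_sorted"
```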

Last edited by jschiwal; 01-29-2010 at 03:50 AM.
 
Old 01-29-2010, 04:34 AM   #5
anu_1
LQ Newbie
 
Registered: Jan 2010
Posts: 12

Original Poster
Rep: Reputation: 0
Thank you all for the explanations.
 
  

