LinuxQuestions.org
Old 02-11-2020, 05:38 PM   #1
aikempshall
Member
 
Registered: Nov 2003
Location: Bristol, Britain
Distribution: Slackware
Posts: 649

Rep: Reputation: 72
Problem sorting file


I have a file

Quote:
Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto
Mrs,SA,Susan,R
Mr,S,Scott,Bro
Mrs,S,Shiralee,
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mrs,TM,Tracey,
Mr,SV,Stephen,
I want to sort it into title order so the male title comes before the female title. So I use -

Code:
sort < input.unsorted
Which gives me

Quote:
Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto
Mrs,SA,Susan,R
Mr,S,Scott,Bro
Mrs,S,Shiralee,
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mrs,TM,Tracey,
Mr,SV,Stephen,
The file appears to remain unsorted!

Alex
 
Old 02-11-2020, 05:49 PM   #2
scasey
Senior Member
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.7.1908
Posts: 4,378

Rep: Reputation: 1552
Code:
sort < input.unsorted
Output goes to STDOUT (that is, the screen). The file itself is not changed.

Also, the redirect is not necessary.
Code:
sort input.unsorted
will work as well, with the same result.

See man sort for full details of how to use the sort command.
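To make the point about STDOUT concrete, here is a minimal sketch that uses a scratch directory and three rows from the original file. The `workdir` variable and the file location are made up for illustration, and `LC_ALL=C` is set only so that the result is the same on every system (that setting is discussed further down the thread).

```shell
# Scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'Mrs,NJ,Nicola,' 'Mr,SP,Stuart,D' 'Mr,S,Sam,Barto' > "$workdir/input.unsorted"

# Plain sort prints the sorted lines; the file on disk keeps its order.
LC_ALL=C sort "$workdir/input.unsorted"
first_after_print=$(head -n 1 "$workdir/input.unsorted")

# -o names the output file, and it is allowed to be the input file itself.
# (Do not use "sort f > f": the shell empties f before sort reads it.)
LC_ALL=C sort -o "$workdir/input.unsorted" "$workdir/input.unsorted"
first_after_o=$(head -n 1 "$workdir/input.unsorted")

rm -rf "$workdir"
```

After the first command the file still starts with the Mrs,NJ row; only after the `-o` run does the file itself change.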
 
1 members found this post helpful.
Old 02-12-2020, 01:18 AM   #3
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,462
Blog Entries: 3

Rep: Reputation: 2230
Sending data into a program or a file, or capturing what comes out of a program, is called redirection. As you've already done, you can redirect data into a program. You can also capture the data coming out of that program and save it in a file. See this tutorial for examples and try a few: http://www.tldp.org/LDP/abs/html/io-redirection.html
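A short sketch of the common redirection operators may help here. The file names `in.txt` and `out.txt` and the single-letter data are placeholders, not anything from the original poster's system.

```shell
workdir=$(mktemp -d)

# ">" sends a command's standard output into a file (overwriting it).
printf '%s\n' 'b' 'a' 'c' > "$workdir/in.txt"

# "<" feeds a file to standard input; ">" captures the sorted result.
sort < "$workdir/in.txt" > "$workdir/out.txt"

# ">>" appends to a file instead of overwriting it.
printf '%s\n' 'd' >> "$workdir/out.txt"

# "|" connects one command's output to the next command's input.
piped=$(sort -r < "$workdir/out.txt" | head -n 1)

captured=$(cat "$workdir/out.txt")
rm -rf "$workdir"
```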

Last edited by Turbocapitalist; 02-12-2020 at 01:20 AM.
 
Old 02-12-2020, 07:07 AM   #4
aikempshall
Member
 
Registered: Nov 2003
Location: Bristol, Britain
Distribution: Slackware
Posts: 649

Original Poster
Rep: Reputation: 72
I appreciate the information on redirection. However, what I would like to understand is why what appears on stdout is not sorted.
 
Old 02-12-2020, 07:27 AM   #5
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,462
Blog Entries: 3

Rep: Reputation: 2230
It is sorted. Look more closely. If you want to sort on various fields as delimited by commas, then use the -k and -t options.

Code:
sort -t , -f -k1,1 -k3,3 -k2,2 < input.unsorted
The -f option (fold lowercase to uppercase when comparing) wouldn't hurt much either.

See "man sort".
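As a rough illustration of what the keys do, the sketch below runs the same options over six rows from the original file. The `input` and `sorted` variable names are only for this example.

```shell
input='Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto
Mr,SV,Stephen,
Mr,S,Stephen,C'

# -t ,   split each line into fields at the commas
# -f     ignore case when comparing
# -k1,1  first key: field 1 only (the title)
# -k3,3  second key: field 3 (first name)
# -k2,2  third key: field 2 (initials), used when the names tie
sorted=$(printf '%s\n' "$input" | sort -t , -f -k1,1 -k3,3 -k2,2)
printf '%s\n' "$sorted"
```

Because the first key stops at the end of field 1, Mr is compared with Mrs directly and the comma no longer takes part in the comparison. The two Stephen rows tie on the name and are then ordered by their initials (S before SV).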

Last edited by Turbocapitalist; 02-12-2020 at 07:29 AM. Reason: too many commas
 
3 members found this post helpful.
Old 02-12-2020, 10:31 AM   #6
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,759

Rep: Reputation: 610
With this InFile ...
Code:
Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto
Mrs,SA,Susan,R
Mr,S,Scott,Bro
Mrs,S,Shiralee,
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mrs,TM,Tracey,
Mr,SV,Stephen,
... this sort ...
Code:
LC_ALL=C sort "$InFile" >"$OutFile"
... produced this OutFile ...
Code:
Mr,S,Sam,Barto
Mr,S,Scott,Bro
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mr,SP,Stuart,D
Mr,SV,Stephen,
Mrs,NJ,Nicola,
Mrs,R,Rhian,Ro
Mrs,S,Shiralee,
Mrs,SA,Susan,R
Mrs,TM,Tracey,
Daniel B. Martin

.
 
2 members found this post helpful.
Old 02-12-2020, 12:24 PM   #7
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 9,411

Rep: Reputation: 4176
In your locale, and in mine (en_US.UTF-8), the sorting algorithm is clearly ignoring the commas.

From the "man sort" manpage:

Quote:
*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values
Here's a link to information on how it works in UTF-8 locales:

https://unix.stackexchange.com/a/252426
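The difference can be sketched with two rows from the file. The second pipeline is only a rough emulation, on the assumption that the locale skips punctuation and case on its first comparison pass; it is not how sort is implemented, but it reproduces the order seen in the first post.

```shell
rows='Mrs,NJ,Nicola,
Mr,SP,Stuart,D'

# Byte order: the comma (0x2C) sorts before "s" (0x73), so "Mr," precedes "Mrs".
byte_order=$(printf '%s\n' "$rows" | LC_ALL=C sort)

# Rough emulation of a locale that skips punctuation and case on its first
# pass: drop the commas, fold to upper case, then compare what is left.
emulated=$(printf '%s\n' "$rows" | tr -d ',' | tr '[:lower:]' '[:upper:]' | LC_ALL=C sort)

printf '%s\n\n%s\n' "$byte_order" "$emulated"
```

With the commas gone the two rows become MRSNJNICOLA and MRSPSTUARTD, and N sorts before P, which is why the Mrs,NJ row ends up ahead of the Mr,SP row.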

Last edited by dugan; 02-12-2020 at 12:29 PM.
 
2 members found this post helpful.
Old 02-13-2020, 09:24 AM   #8
aikempshall
Member
 
Registered: Nov 2003
Location: Bristol, Britain
Distribution: Slackware
Posts: 649

Original Poster
Rep: Reputation: 72
OK, I think I'm starting to understand. My system-wide profile has

Code:
export LC_ALL=en_GB
export LANG=en_GB

which doesn't work for my example.

Whereas the suggestion by danielbmartin

Code:
LC_ALL=C sort "$InFile" >"$OutFile"
works in a local ksh

The suggestion by Turbocapitalist

Code:
sort -t , -f -k1,1 -k3,3 -k2,2 < input.unsorted
also works in local ksh.

So which way to jump? It seems to depend on what's being sorted.

At the moment I'm favoring:

Code:
export LC_ALL=
export LC_COLLATE=C
export LANG=en_GB
which also works in a local ksh.
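For reference, the precedence at work here is: a non-empty LC_ALL overrides everything, otherwise LC_COLLATE decides the sort order, and LANG is only the fallback. A small sketch, using two rows from the file and made-up variable names:

```shell
rows='Mrs,NJ,Nicola,
Mr,SP,Stuart,D'

# LC_ALL unset: LC_COLLATE alone decides how sort compares lines.
# The unset happens inside the command substitution, so it does not leak out.
collate_only=$(unset LC_ALL; printf '%s\n' "$rows" | LC_COLLATE=C sort)

# LC_ALL set: it wins over LC_COLLATE, whatever LC_COLLATE says.
all_wins=$(printf '%s\n' "$rows" | LC_ALL=C LC_COLLATE=en_GB sort)

printf '%s\n\n%s\n' "$collate_only" "$all_wins"
```

Both commands compare byte by byte, so both put the Mr row first. Setting only LC_COLLATE leaves the other categories (messages, dates, character classes) on whatever LANG says.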

So on a test machine I'm going to change the system-wide profile, give it a thorough test and see what transpires. The initial tests seem to show that this is the right way to go.

Thanks for pointing me in the right direction and deepening my understanding.

Thanks to dugan for highlighting

Quote:
*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
How could I have possibly missed such a warning! Must remember to thoroughly read the manual and to NOT ignore any warnings!


Again, thanks to all.

Alex
 
Old 02-13-2020, 10:54 AM   #9
boughtonp
Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 243

Rep: Reputation: 122
Quote:
Originally Posted by aikempshall View Post
So on a test machine I'm going to change the system-wide profile, give it a thorough test and see what transpires. The initial tests seem to show that this is the right way to go.
That'll affect a bunch of other stuff too though? Why not just change it within the script(s) you're using sort for?

Or to make it global you could alias sort:
Code:
alias sort='LC_ALL=C sort'
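Inside a script, a small wrapper function is one way to do the same thing. This is only a sketch, and `csort` is a hypothetical name. Note that bash does not expand aliases in non-interactive scripts by default, so the alias is mainly useful at the prompt.

```shell
# csort: hypothetical helper that always sorts in byte order.
# "$@" passes every option and file name through to sort unchanged.
csort() {
    LC_ALL=C sort "$@"
}

result=$(printf '%s\n' 'Mrs,NJ,Nicola,' 'Mr,SP,Stuart,D' | csort)
printf '%s\n' "$result"
```

The locale change applies only to the sort command started by the function, so the rest of the script keeps the system settings.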

Quote:
How could I have possibly missed such a warning! Must remember to thoroughly read the manual and to NOT ignore any warnings!
Well you probably assumed it was referring to things like accented letters coming together, and had nothing to do with ignoring commas because why should it?

 
Old 02-13-2020, 10:56 AM   #10
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,759

Rep: Reputation: 610
Quote:
Originally Posted by aikempshall View Post
... So which way to jump? ..
If you aren't a stickler for purity, consider this brute-force solution. It avoids the complications that arise from different geographies, computing environments, and collating sequences.

With this InFile ...
Code:
Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto
Mrs,SA,Susan,R
Mr,S,Scott,Bro
Mrs,S,Shiralee,
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mrs,TM,Tracey,
Mr,SV,Stephen,
... this code ...
Code:
 sed 's/^Mrs,/XXX,/' "$InFile"  \
|sort                           \
|sed 's/^XXX,/Mrs,/'            \
>"$OutFile"
... produced this OutFile ...
Code:
Mr,SP,Stuart,D
Mr,S,Sam,Barto
Mr,S,Scott,Bro
Mr,S,Stephen,C
Mr,S,Symon,Wil
Mr,SV,Stephen,
Mrs,NJ,Nicola,
Mrs,R,Rhian,Ro
Mrs,SA,Susan,R
Mrs,S,Shiralee,
Mrs,TM,Tracey,
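The same substitute, sort, restore idea can be written with an explicit rank in front of each line. This is a variation rather than the code above: the rank values are arbitrary, and it assumes a sort that supports the stable -s option (GNU, BSD and BusyBox sort do; POSIX does not require it).

```shell
rows='Mrs,NJ,Nicola,
Mr,SP,Stuart,D
Mrs,R,Rhian,Ro
Mr,S,Sam,Barto'

# 1. awk puts a numeric rank in front of each line: 1 for Mr, 2 for anything else.
# 2. sort orders on that rank only (-k1,1n); -s keeps the input order within a rank.
# 3. cut removes the rank again.
ranked=$(printf '%s\n' "$rows" |
    awk -F, '{ rank = ($1 == "Mr") ? 1 : 2; print rank "," $0 }' |
    sort -s -t , -k1,1n |
    cut -d , -f 2-)
printf '%s\n' "$ranked"
```

Since the only key is a number, the result does not depend on the collating sequence at all. Rows with the same title simply stay in the order they had in the input.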

Daniel B. Martin

.
 
  

