LinuxQuestions.org
Old 11-29-2011, 11:17 AM   #1
SilversleevesX
Member
 
Registered: May 2009
Posts: 181
Blog Entries: 9

Rep: Reputation: 15
BASH sort string separated by commas


While the Subject line says BASH, I'm not averse to other scripting solutions (awk, perl, etc.).

Years ago, I was greatly impressed by Extensis Portfolio's innate ability to alphabetically sort keywords and supplemental categories added to those picture file types that supported IPTC and XMP metadata. I have yet to find another application, GUI or CLI, that does this in any OS on any hardware platform, and I've been searching off and on since 1996.

In the meantime, I've "taken the job in hand" to do my own descriptive metadata writing alphabetically. Here's where I've run into a problem. I don't always think alphabetically when describing a picture I wish to add keywords and supplemental categories to, and rearranging by hand can get tedious; even when it's not, it's somewhat time-consuming.

I've also encountered situations where, when adding a new key or supplemental to a picture's metadata, most GUI apps attach it to the end of the set, which means more editing on my part.

To make quick work of this editing is what I'm after. The best I've come up with in BASH shell is
Code:
echo "adult,amateur,happy,blonde,funny,waterslide,rain" | tr , "\n" | sort | tr "\n" , ; echo
Which returns
Quote:
adult,amateur,blonde,funny,happy,rain,waterslide,
Except for the trailing comma, I'd be satisfied. Call it petty, but a few months back, I rewrote three or four BASH scripts in such a way that a trailing comma was made part of the text, instead of being seen as a delimiter. I suppose editing out one comma takes far less time than rearranging whole strings of words, but frankly, I'd rather not have to.

As I mentioned before, I'm not married to, or insistent on, a BASH solution.

Looking forward to any help on this plane,

BZT

Last edited by SilversleevesX; 11-29-2011 at 11:21 AM. Reason: Two strings did not match for content
 
Old 11-29-2011, 11:48 AM   #2
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 718

Rep: Reputation: 72
Hi.

You might find msort useful. This script shows the context, then your data, then the output, which may need pre- or post-processing:
Code:
#!/usr/bin/env bash

# @(#) sh-minimal	Demonstrate record separator (,) with msort.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
# Redefine db as a no-op to silence debug output; delete this line to re-enable it.
db() { : ; }
# Show the environment and versions if the local "context" helper exists.
C=$HOME/bin/context && [ -f "$C" ] && "$C" msort

# Input file: first argument, defaulting to data1.
FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " Results, noting missing record separator after rain:"
msort -r"," -w --comparison-type l --quiet "$FILE"
pe

pl " Results, adding record separator:"
sed 's/$/,/' "$FILE" |
msort -r"," -w --comparison-type l --quiet 
pe

pl " Results, post-process, remove embedded newline attached to rain:"
msort -r","  -n1 --comparison-type l --quiet "$FILE" |
tr -d '\n'
pe

pl " Results, post-process, remove newline, trailing record separator:"
msort -r","  -n1 --comparison-type l --quiet "$FILE" |
tr -d '\n' |
sed 's/,$//'
pe

exit 0
producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
GNU bash 3.2.39
msort 8.44

-----
 Input data file data1:
adult,amateur,happy,blonde,funny,waterslide,rain

-----
 Results, noting missing record separator after rain:
adult,amateur,blonde,funny,happy,rain
,waterslide,

-----
 Results, adding record separator:

,adult,amateur,blonde,funny,happy,rain,waterslide,

-----
 Results, post-process, remove embedded newline attached to rain:
adult,amateur,blonde,funny,happy,rain,waterslide,

-----
 Results, post-process, remove newline, trailing record separator:
adult,amateur,blonde,funny,happy,rain,waterslide
The man page for msort is short, but the on-line documentation is extensive. See: http://freecode.com/projects/msort

msort was available in the Debian repository for me. It is very, very useful in complicated situations.

Best wishes ... cheers, makyo
 
Old 12-25-2011, 09:25 PM   #3
SilversleevesX
Member
 
Registered: May 2009
Posts: 181
Blog Entries: 9

Original Poster
Rep: Reputation: 15
Thanks for the tip on msort.

I'm going to copy your script suggestion and as soon as I have msort installed, I'll try it out on a few strings. Good to hear it's so well documented.

BZT
 
Old 12-25-2011, 10:46 PM   #4
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Here's my awk solution. Copy and paste the following code into a new file named sort-csv.

Code:
# Sorts comma delimited fields per line.
# Requires Gawk.

BEGIN {
    FS=","
    OFS=","
}

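{
    # Split the line on FS, sort the pieces, then write them back into the fields.
    n = split($0, words)
    asort(words)
    for (i = 1; i <= n; i++) $i = words[i]
    # Assigning to fields rebuilds $0 using OFS, so print emits the sorted line.
    print
}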
Single lines may be sorted from the command line by piping into gawk -f sort-csv, just as you were doing.

Code:
$ echo "adult,amateur,happy,blonde,funny,waterslide,rain" | gawk -f sort-csv
adult,amateur,blonde,funny,happy,rain,waterslide
$
If you have a file which contains lines of comma delimited words, then you can sort the whole file at once with the syntax gawk -f sort-csv file-name.

Here's an example data file saved as my-list.csv.

Code:
strawberries,blueberries,strawberries
pencil,crayon,chalk,marker
bus,car,train,motorcycle,bicycle,skateboard
Here's how to invoke the script.

Code:
gawk -f sort-csv my-list.csv
Here's the output produced by the script.

Code:
blueberries,strawberries,strawberries
chalk,crayon,marker,pencil
bicycle,bus,car,motorcycle,skateboard,train
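For systems without Gawk, the same per-line sort can be sketched with plain coreutils. This is only an illustration, not part of the script above: the function name sort_csv_lines is made up, and LC_ALL=C is an assumption added here to keep the sort order predictable. paste -s -d ',' joins the sorted words with commas between them only, so no trailing comma appears.

```shell
#!/usr/bin/env bash
# sort_csv_lines is a hypothetical helper name (not from the thread).
# It reads the file named in $1, or standard input when no file is given.
sort_csv_lines() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        # One word per line, sort, then rejoin with commas between items only.
        printf '%s\n' "$line" | tr ',' '\n' | LC_ALL=C sort | paste -s -d ',' -
    done < "${1:-/dev/stdin}"
}

# Demo using the same sample data as my-list.csv.
sort_csv_lines <<'EOF'
strawberries,blueberries,strawberries
pencil,crayon,chalk,marker
bus,car,train,motorcycle,bicycle,skateboard
EOF
```

With a file on disk it would be called as sort_csv_lines my-list.csv. It starts three processes per line, so the gawk script is likely the better choice for large files.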
HTH

Last edited by Telengard; 12-25-2011 at 10:48 PM.
 
Old 12-26-2011, 12:18 AM   #5
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 369
Ok, maybe I don't understand the problem completely, but if your only problem with your original command sequence is the trailing comma, then:
Code:
$ echo "adult,amateur,happy,blonde,funny,waterslide,rain" | tr , "\n" | sort | tr "\n" , | sed 's@\(.*\),@\1\n@'
adult,amateur,blonde,funny,happy,rain,waterslide
EDIT:
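Or perhaps a simpler sed command that skips the capture group and just anchors on the trailing comma (both versions rely on GNU sed treating \n in the replacement as a newline):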
Code:
$ echo "adult,amateur,happy,blonde,funny,waterslide,rain" | tr , "\n" | sort | tr "\n" , | sed 's@,$@\n@'
adult,amateur,blonde,funny,happy,rain,waterslide
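If the result is going into a shell variable anyway, the trailing comma can also be dropped without sed, using Bash parameter expansion. A minimal sketch, assuming Bash; the function name sort_keywords is hypothetical, and LC_ALL=C is added here only to make the ordering reproducible:

```shell
#!/usr/bin/env bash
# sort_keywords is a hypothetical helper name (not from the thread).
sort_keywords() {
    local sorted
    # Same tr | sort | tr pipeline as above; the final newline from sort
    # is turned into a comma, which is where the trailing comma comes from.
    sorted=$(printf '%s\n' "$1" | tr ',' '\n' | LC_ALL=C sort | tr '\n' ',')
    # ${sorted%,} removes a single trailing comma, if there is one.
    printf '%s\n' "${sorted%,}"
}

sort_keywords "adult,amateur,happy,blonde,funny,waterslide,rain"
```

This prints the sorted list followed by a newline, with no comma at the end.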

Last edited by Dark_Helmet; 12-26-2011 at 12:24 AM.
 
Tags
awk, bash, sort