LinuxQuestions.org
02-02-2012, 09:32 PM   #1
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,084

Rep: Reputation: 287
awk - sort words within each line


I want to sort the blank-delimited words within each line, independently of the words in other lines. Punctuation should be dropped and letter case ignored when sorting. For example, this input file ...
Quote:
Once upon a midnight dreary, while I pondered weak and weary,
Over many a quaint and curious volume of forgotten lore,
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
''Tis some visitor,' I muttered, 'tapping at my chamber door -
Only this, and nothing more.
... would be changed to this output file ...
Quote:
a and dreary I midnight Once pondered upon weak weary while
a and curious forgotten lore many of Over quaint volume
a came I napping nearly nodded suddenly tapping there While
As at chamber door gently my of one rapping rapping some
at chamber door I muttered my some tapping Tis visitor
and more nothing Only this
This is a solved problem! However, I am starting to learn awk and wonder if awk has a simpler way to perform this transformation.

An awk tutorial at ...
http://www.linuxjournal.com/article/8913
... says:
Quote:
Every awk program has three parts: a BEGIN block, which is executed once before any input is read; a main loop, which is executed for every line of input; and an END block, which is executed after all of the input is read.
This might be interpreted (or misinterpreted) to mean that a sort command in the main loop will sort the words in every line of input. Too good to be true? This code ...
Code:
cat < $InFile  \
|awk sort      \
> $Work3
... generates an empty file.

Daniel B. Martin
 
02-02-2012, 10:33 PM   #2
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 623

Rep: Reputation: 364
Hi.

Code:
$ awk -v IGNORECASE=1 '{gsub(/[[:punct:]]/, ""); split($0, w); s=""; for(i=1; i<=asort(w); i++) s=s w[i] " "; print s }' infile.txt 
a and dreary I midnight Once pondered upon weak weary while 
a and curious forgotten lore many of Over quaint volume 
a came I napping nearly nodded suddenly tapping there While 
As at chamber door gently my of one rapping rapping some 
at chamber door I muttered my some tapping Tis visitor 
and more nothing Only this
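
As for the `awk sort` attempt: awk reads its first non-option argument as the program text, so `sort` is parsed as a pattern made of a single uninitialized variable. That pattern is always false, so nothing is printed and the output file stays empty. awk can still hand the words to the real sort command through a pipe. A rough sketch, assuming a POSIX awk plus the standard sort and paste utilities (the function name sort_words_pipe is made up for illustration):

```shell
# Hypothetical helper: sort the words of each line with the external sort(1).
sort_words_pipe() {
    awk '{
        # sort -f folds case; paste -s joins the sorted words onto one line
        cmd = "sort -f | paste -s -d \" \" -"
        for (i = 1; i <= NF; i++)
            print $i | cmd    # feed one word per line to the pipeline
        close(cmd)            # wait for it to finish before the next record
    }' "$@"
}
```

Usage would be `sort_words_pipe infile.txt`. This starts a new sort process for every line, keeps punctuation, prints nothing for blank lines, and orders words according to the current locale, so the in-awk approaches are likely the better fit for large files.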
 
02-02-2012, 10:45 PM   #3
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 942
firstfire, asort() is a GNU awk extension, so it is better to invoke gawk explicitly (not awk): the one-liner won't work with other awk implementations. It also leaves a trailing space on every output line.
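
Both points can be addressed without giving up asort(). A minimal sketch, assuming gawk is installed (the wrapper name sort_words_gawk is only illustrative); it also calls asort() once per line instead of once per loop test:

```shell
# Hypothetical wrapper; needs GNU awk because asort() and IGNORECASE
# are gawk extensions.
sort_words_gawk() {
    gawk -v IGNORECASE=1 '{
        gsub(/[[:punct:]]/, "")    # strip punctuation, as in the one-liner
        split($0, w)               # w[1..n] holds the words of this line
        n = asort(w)               # sort once; IGNORECASE makes it case-blind
        line = ""
        for (i = 1; i <= n; i++)
            line = line (i > 1 ? " " : "") w[i]   # separator only between words
        print line
    }' "$@"
}
```

Usage would be `sort_words_gawk infile.txt`.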

Here's my version, long but well commented.
Code:
awk '{
       n = split("", list) # n = 0, list = empty array

       # For loop over all fields in this record
       for (f = 1; f <= NF; f++) {

           # i = the position to insert in the list
           i = 1

           # Skip earlier ones
           while (i <= n && list[i] < $f) i++

           # Move later ones
           for (o = n; o >= i; o--)
               list[o+1] = list[o]

           # Insert this field
           list[i] = $f

           # List grew by one
           n++
       }

       # output list, using spaces as separators.
       for (i = 1; i < n; i++)
           printf("%s ", list[i])
       printf("%s\n", list[n])
    }'
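
As written, the code above compares words case-sensitively and keeps punctuation, so its output will differ from the requested one. A possible variation on the same insertion sort, sketched under the assumption that the input is plain ASCII (the names sort_words_portable and key are made up here, not part of the code above):

```shell
# Hypothetical wrapper: insertion sort on lower-cased keys, plain POSIX awk.
sort_words_portable() {
    awk '{
        # Assumption: ASCII input. Drop everything except letters, digits
        # and blanks; changing $0 makes awk re-split the fields.
        gsub(/[^A-Za-z0-9 \t]/, "")

        n = 0
        split("", list)    # words in sorted order
        split("", key)     # lower-cased copies used for comparison

        for (f = 1; f <= NF; f++) {
            k = tolower($f)

            # Find the insert position; <= keeps equal keys in input order
            i = 1
            while (i <= n && key[i] <= k) i++

            # Shift later entries up by one, then insert
            for (o = n; o >= i; o--) {
                list[o+1] = list[o]
                key[o+1] = key[o]
            }
            list[i] = $f
            key[i] = k
            n++
        }

        line = ""
        for (i = 1; i <= n; i++)
            line = line (i > 1 ? " " : "") list[i]
        print line
    }' "$@"
}
```

Usage would be `sort_words_portable infile.txt`. The keys are always compared as strings, so numeric words sort as text ("10" before "9").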
 
02-03-2012, 11:17 AM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,084

Original Poster
Rep: Reputation: 287
Thanks for both responses. I've timed both and find the execution times nearly equal, and much faster than the solution I coded before posting this thread.

Thank you, firstfire, for introducing me to asort. I stand in awe of a one-line solution to a non-trivial text-processing question.

Thank you, Nominal Animal, for the well-commented and instructive code.

It will take some time for me to digest the logic of these solutions. I'm travelling a path of self-education in Linux. At the outset I decided to become competent with grep, sed, cut, paste, sort, and uniq before moving on to awk. I now see that as a short-sighted approach and will henceforth delve deeper into awk.

This thread is marked SOLVED!

Daniel B. Martin
 
  



Tags
awk, sorting

