LinuxQuestions.org


danielbmartin 08-21-2012 09:42 AM

Reverse words in each line
 
Have a file of character strings:
Code:

one  two three    four five 
  first second    third    fourth fifth
time for a coffee break

(Note: some lines have leading and/or trailing blanks.)

Want:
Code:

  five four    three two one
fifth fourth    third    second first 
  break coffee a for time

I wrote an awk
Code:

awk '{for (i=NF;i>0;i--) printf("%s ",$i)} {printf("%s","\n")}' $InFile
... which reverses the words in a line, but it collapses strings of blanks into a single blank, as shown:
Code:

five four three two one
fifth fourth third second first
break coffee a for time

Please advise.

Daniel B. Martin

druuna 08-21-2012 09:52 AM

Have a look at the rev command. Answered too fast, sorry!

firstfire 08-21-2012 01:27 PM

Hi.

Here is my attempt in sed:
Code:

$ echo 'ab  cd ef' | sed -r 's/\w+| +/<&>/g; s/^.*$/\n&\n/;l; :a; s/(\n<[^<>]+>)(.*)(<[^<>]+>\n)/\3\2\1/; l; ta; s/[<>\n]//g'
\n<ab><  ><cd>< ><ef>\n$
<ef>\n<  ><cd>< >\n<ab>$
<ef>< >\n<cd>\n<  ><ab>$
<ef>< >\n<cd>\n<  ><ab>$
ef cd  ab

As always, the l commands are optional. The algorithm is the same as in `info sed examples reverse', except that I first delimit "words" and treat them as "characters".

A bit more detail:
1) s/\w+| +/<&>/g; -- surround each word and block of spaces by <>.
2) s/^.*$/\n&\n/; -- surround whole line by newlines.
3) In a loop do the following transformation:
Code:

\n<word1> ... <word2>\n  -> <word2>\n ... \n<word1>
4) Remove the <, > and \n characters.
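
The same "split the line into tokens, then reverse the tokens" idea can be sketched in awk. This is only an illustration, not the sed script above: the function name `reverse_tokens` is made up, and it assumes a "word" is any run of non-blank characters (the sed version uses \w+, which is narrower).

```shell
# Hypothetical helper: reverse the order of tokens on each line,
# where a token is either a run of blanks or a run of non-blanks.
reverse_tokens() {
    awk '{
        out = ""
        rest = $0
        # Peel one token off the front of the line at a time.
        while (match(rest, /^( +|[^ ]+)/)) {
            # Prepend the token, so the tokens end up in reverse order.
            out = substr(rest, 1, RLENGTH) out
            rest = substr(rest, RLENGTH + 1)
        }
        print out
    }'
}

printf '%s\n' 'ab  cd ef' | reverse_tokens   # prints: ef cd  ab
```

Because the runs of blanks are tokens too, their widths survive the reversal unchanged.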

BTW, here is my previous attempt :eek:
Code:

#!/bin/sed -rf
#surround line by \n markers
s/^.*$/\n&\n/
:a
        # mark left side of last word or block of spaces by \n:
        # \nword1 ... word2\n.. -> \nword1 ... \nword2\n..
        s/((\w+| +)\n)([^\n]*)$/\n\1\3/
        # swap the two marked blocks:
        # \nword1 ... \nword2\n.. -> word2\n ... \nword1..
        s/(\n(\w+| +))(.*)\n((\w+| +)\n)/\4\3\1/
l
        /\n\n/be
ba
:e
s/\n//g


danielbmartin 08-21-2012 02:49 PM

Quote:

Originally Posted by firstfire (Post 4760302)
Code:

sed -r 's/\w+| +/<&>/g; s/^.*$/\n&\n/;l; :a; s/(\n<[^<>]+>)(.*)(<[^<>]+>\n)/\3\2\1/; l; ta; s/[<>\n]//g'
Excellent!
The optional l commands help to understand the logic, step-by-step.

It will be interesting to see if an awk expert has a solution.

Daniel B. Martin

ntubski 08-21-2012 03:11 PM

Code:

awk -F'[ ]' '{for (i=NF;i>0;i--) printf("%s ",$i)} {printf("%s","\n")}' $InFile
See sections 4.5.1 "Whitespace Normally Separates Fields" and 4.5.2 "Using Regular Expressions to Separate Fields" of the gawk manual.
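
A quick way to see the difference between the two splitting modes is to count fields. This is a small sketch; the function names `count_fields_default` and `count_fields_blank` are invented for the illustration.

```shell
# Default FS: a run of blanks counts as ONE separator, and leading
# and trailing blanks are ignored.
count_fields_default() { printf '%s\n' "$1" | awk '{ print NF }'; }

# FS = "[ ]": a regex matching exactly one blank, so EVERY blank is a
# separator and adjacent blanks produce empty fields between them.
count_fields_blank() { printf '%s\n' "$1" | awk -F'[ ]' '{ print NF }'; }

count_fields_default 'one  two three'   # prints 3
count_fields_blank   'one  two three'   # prints 4 (one empty field)
```

Those empty fields are what carry the extra blanks through when the fields are printed back out in reverse with a single blank between them.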

danielbmartin 08-21-2012 03:31 PM

Quote:

Originally Posted by ntubski (Post 4760401)
Code:

awk -F'[ ]' '{for (i=NF;i>0;i--) printf("%s ",$i)} {printf("%s","\n")}' $InFile

Nice. Close to perfect. Every line in the output file has one more trailing blank than it should have. Is this an easy fix?

Daniel B. Martin

ntubski 08-22-2012 01:04 PM

Quote:

Originally Posted by danielbmartin (Post 4760414)
Every line in the output file has one more trailing blank than it should have. Is this an easy fix?

Yes.

But you knew that because it's your code, right? ;)

danielbmartin 08-22-2012 02:29 PM

Quote:

Originally Posted by ntubski (Post 4761333)
Yes.

But you knew that because it's your code, right? ;)

No, I didn't. Sincerely. No smilies.

I tried to trim one blank from each line by adding a sed following your awk and couldn't make that work.

If you can show how to improve your awk that will be appreciated.

Daniel B. Martin

ntubski 08-22-2012 04:03 PM

Hey, it's really your awk, here is how to fix it:

Code:

awk -F'[ ]' '{for (i = NF; i > 0; i--) if (i > 1) printf("%s ", $i); else printf("%s\n", $i)}' $InFile

# short version, for golfers.
awk -F'[ ]' '{for(i=NF;i;i--)printf("%s"(i>1?" ":"\n"),$i)}' $InFile

The last word printed is $1 (because we are going in reverse), so we only want to print a space when i > 1 (i.e. before the last word). After the last word we put the newline.
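
One caveat with printing inside the loop: on an empty input line NF is 0, the loop never runs, and no newline is printed, so empty lines disappear from the output. A possible variant, sketched here under the hypothetical name `reverse_fields`, builds the line first and prints it once:

```shell
# Hypothetical helper: reverse blank-separated fields, keeping every blank.
reverse_fields() {
    awk -F'[ ]' '{
        line = ""
        for (i = NF; i > 0; i--) {
            # A blank goes between fields, but not after the last one ($1).
            sep = (i > 1) ? " " : ""
            line = line $i sep
        }
        # print runs once per input line, so empty lines are kept too.
        print line
    }'
}

# Brackets make leading and trailing blanks visible.
printf '%s\n' 'one  two three ' | reverse_fields | sed 's/.*/[&]/'
# prints: [ three two  one]
```

The trailing blank of the input becomes a leading blank in the output, and there is no extra blank at the end of the line.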

danielbmartin 08-22-2012 05:17 PM

Quote:

Originally Posted by ntubski (Post 4761476)
Hey, it's really your awk ...

Okay, now I understand. You wrote it, you gave it to me, so now it's mine... and I'm delighted to have it. The output file is perfect.

The key to this clever awk is ...
Code:

-F'[ ]'
... and that's something I had not seen before.

Again, thank you for your patience and expertise.

Daniel B. Martin

ntubski 08-22-2012 06:04 PM

Quote:

Originally Posted by danielbmartin (Post 4761515)
Okay, now I understand. You wrote it, you gave it to me, so now it's mine... and I'm delighted to have it.

Um, I meant that it's the same code from your post #1.

danielbmartin 08-22-2012 08:47 PM

Quote:

Originally Posted by ntubski (Post 4761536)
Um, I meant that it's the same code from your post #1.

There is a similarity. There is also an important difference: yours works, mine didn't!

Daniel B. Martin

grail 08-23-2012 10:40 AM

I know you have said you might not be ready for Ruby before, but thought I would show you an alternative:
Code:

ruby -ape 'i=0;$_.chop!.reverse!.scan(/\w+/).each{|x| $_.gsub!(x,$F[i-=1])};$_ += "\n"' file
And to assist with understanding (for anyone looking):
Code:

-ape - a:create fields based on whitespace (default),p:read in each line and print at the end,e:following is script to be run

i=0 - set counter

$_ - the read line
chop! - remove terminating character (\n)
reverse! - reverse the string (in our case the whole line minus the terminating new line)
scan(/\w+/) - search the string repeatedly (important!) looking for words and return an array with one element per word found
each - for each element of the array returned from scan perform the following tasks

|x| - variable with the copy of each element stored in it
gsub!(x,$F[i-=1]) - as per awk, find value stored in 'x' and replace with the value stored in global array (created by -a option) starting at the end of the array
                    and working backwards

$_ += "\n" - re-add the terminating new line character as -p option will print $_'s new value

! - a method suffixed with '!' performs the task on the object and changes the object itself.  Without '!' it only returns a new object with the change performed

Hope that helps :)

danielbmartin 08-23-2012 03:11 PM

Quote:

Originally Posted by grail (Post 4762240)
... thought I would show you an alternative:
Code:

ruby -ape 'i=0;$_.chop!.reverse!.scan(/\w+/).each{|x| $_.gsub!(x,$F[i-=1])};$_ += "\n"' file
Thank you for this intriguing sample of ruby. I'm still learning awk and not ready to venture beyond that.

I hear (and read) about the relative merits of perl, python, and ruby. At some point I will choose one.

Daniel B. Martin

ntubski 08-23-2012 08:47 PM

It might be interesting to look at the awk code translated to ruby:
Code:

ruby -pe '$_=$_.chop.split(" ").reverse.join(" ")+"\n"' file
That's basically Daniel's initial awk (with the trailing space also fixed); it has the same space-collapsing problem. The fix is also the same: split on a regex matching a single space instead of on the string " ":
Code:

ruby -pe '$_=$_.chop.split(/ /, -1).reverse.join(" ")+"\n"' file
ruby's split also requires the -1 in order not to drop trailing spaces.


All times are GMT -5.