Sort File by Field - but with a Twist! ;)
I have a file like this:
saegh iubiae iabezu PATTERN cbizge atvet faw
efenmi PATTERN beub htp rubwi riwbr
iauebiubg ubneiu PATTERN aoihgr zvezg
...
I want to sort the lines of the file with the field to the right of PATTERN as the sort key. The correctly sorted example file would look like this:
iauebiubg ubneiu PATTERN aoihgr zvezg
efenmi PATTERN beub htp rubwi riwbr
saegh iubiae iabezu PATTERN cbizge atvet faw
Any idea how to accomplish this? Thanks!
moo-cow
|
You could use the following to copy the expression following PATTERN to the beginning of each line:
Code:
sed -e "s/\(.*PATTERN \([^ ]\+\).*\)/\2 \1/"
|
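The sed command on its own only copies the key to the front of each line; to get the sorted file, the decorated lines presumably still have to be sorted and the key stripped off again. A minimal sketch of that full pipeline, assuming the keys contain no spaces. The `sort` and `cut` stages, the temp file and the variable names are additions for illustration, not part of the post, and `[^ ][^ ]*` is used as a portable spelling of the GNU-specific `[^ ]\+`:

```shell
# Sample input from the question, written to a throwaway temp file.
input=$(mktemp)
cat > "$input" <<'EOF'
saegh iubiae iabezu PATTERN cbizge atvet faw
efenmi PATTERN beub htp rubwi riwbr
iauebiubg ubneiu PATTERN aoihgr zvezg
EOF

# 1. sed:  copy the word after PATTERN to the front of the line.
# 2. sort: order the lines by that first field (LC_ALL=C for a fixed collation).
# 3. cut:  drop the prepended key again.
sorted=$(sed -e 's/\(.*PATTERN \([^ ][^ ]*\).*\)/\2 \1/' "$input" \
  | LC_ALL=C sort -k1,1 \
  | cut -d' ' -f2-)

printf '%s\n' "$sorted"
rm -f "$input"
```

Lines that do not contain PATTERN pass through sed unchanged, so they would be sorted by their first word and lose it in the cut step; filter them out first if that matters.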
Nice little challenge there.
This works but may not be the most elegant solution:
Code:
for NEXTWORD in `awk -FPATTERN '{print $2}' test |awk '{print $1}' |sort`
NEXTWORD is an arbitrary name for the variable; you can call it BILLYBOB or anything else you prefer.
awk -FPATTERN '{print $2}' says to print anything that occurs after your PATTERN in the file. This of course starts with the next word following PATTERN. (-F tells awk to use PATTERN as the delimiter instead of white space.)
This is then piped into the next awk, which prints only the first word from the previous awk, which is the word you were interested in sorting on. (Note this uses white space as the delimiter because, as noted above, that is the default for awk. If your next word contains any white space you'd have to figure out a different delimiter to use.)
It then sorts the list of next words alphabetically using the sort command.
Finally it greps for any line that contains the next word found by the awk/awk/sort combo directly after your PATTERN (and for good measure puts a space between those and the surrounding words so it doesn't accidentally hit on an embedded word).
This will work fine so long as each next word follows PATTERN only once in your file. If one appears twice it will still work relative to the other next words, but the two lines themselves may not be in the order you want.
|
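Only the head of the loop is visible in the post as it stands here, so the body has to be guessed from the description. A hedged sketch of what the complete loop might look like, assuming the body is a plain grep for " PATTERN NEXTWORD " with surrounding spaces. The grep line, the temp file standing in for `test`, and the `result` variable are assumptions, not the poster's actual code:

```shell
# Sample input from the question; a temp file stands in for the file "test".
file=$(mktemp)
cat > "$file" <<'EOF'
saegh iubiae iabezu PATTERN cbizge atvet faw
efenmi PATTERN beub htp rubwi riwbr
iauebiubg ubneiu PATTERN aoihgr zvezg
EOF

result=$(
  # Build the sorted list of words that follow PATTERN, then loop over it.
  for NEXTWORD in $(awk -FPATTERN '{print $2}' "$file" | awk '{print $1}' | sort); do
    # Assumed loop body: print every line where NEXTWORD directly follows PATTERN.
    # The surrounding spaces avoid hits on embedded words, but they also mean
    # PATTERN must not start the line and NEXTWORD must not end it.
    grep " PATTERN $NEXTWORD " "$file"
  done
)

printf '%s\n' "$result"
rm -f "$file"
```

This rereads the file once per key, so it is fine for small files but the sed/sort approach scales better.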
A small remark on jlightner's solution: It seems to me that a NEXTWORD appearing twice will also make each corresponding line appear twice in the result, as the file will be grepped twice for NEXTWORD. You can avoid this by piping the output of "sort" through "uniq".
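To make the effect concrete, here is a small sketch that runs the loop with and without uniq on a file where the word aoihgr follows PATTERN twice. The loop body (a grep for " PATTERN NEXTWORD "), the modified sample file and the helper name `sort_by_key` are assumptions made for illustration:

```shell
# Sample file in which two lines share the same word after PATTERN.
file=$(mktemp)
cat > "$file" <<'EOF'
saegh iubiae iabezu PATTERN cbizge atvet faw
efenmi PATTERN aoihgr htp rubwi riwbr
iauebiubg ubneiu PATTERN aoihgr zvezg
EOF

# Hypothetical helper: "$@" is an extra filter applied to the sorted key list.
sort_by_key() {
  for NEXTWORD in $(awk -FPATTERN '{print $2}' "$file" | awk '{print $1}' | sort | "$@"); do
    grep " PATTERN $NEXTWORD " "$file"
  done
}

without_uniq=$(sort_by_key cat)   # key list unchanged: aoihgr is grepped twice
with_uniq=$(sort_by_key uniq)     # duplicates removed: each key is grepped once

printf 'without uniq:\n%s\n\nwith uniq:\n%s\n' "$without_uniq" "$with_uniq"
rm -f "$file"
```

Without uniq both aoihgr lines are printed twice, giving five lines in total; with uniq each line appears once. `sort -u` would do the same job as `sort | uniq` in a single step.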
|
Works great, thanks for your help!
|
Quote:
Restated: my solution can beat up your solution :p |
I was talking about the following effect:
This is the content of the file to be sorted:
Code:
saegh iubiae iabezu PATTERN cbizge atvet faw
Code:
iauebiubg ubneiu PATTERN aoihgr zvezg
Code:
iauebiubg ubneiu PATTERN aoihgr zvezg
|
Quote:
Actually you made a good point. I was confused by you saying "NEXTWORD twice" because I was thinking you meant I used the variable twice - you meant the word the variable represented could have appeared twice. |