Sorting on an interior field
Have: a file of records having this format...
Code:
Name SexAge~StreetAddress~ZipCode
Code:
Bush, George Herbert Walker M70~1600 Pennsylvania Ave ~20500
One approach: devise a clever RegEx that would replace the blank which precedes SexAge with a tilde. Please advise. sed or awk solutions are preferred.
First, because you have spaces within fields, you need field separators which are not spaces. Then, to sort the file, you can use---you guessed it---the sort command!
I created a test file to demonstrate:
Code:
[mherring@herring_desk play]$ more list
Example:
Code:
echo "Bush, George Herbert Walker M70~1600 Pennsylvania Ave ~20500" | sed 's/ \([MFU][0-9]*~\)/~\1/'
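A minimal sketch for checking that substitution on the sample record. The function name to_tilde is only an illustration, not something from the thread. The pattern matches a blank followed by a sex letter (M, F, or U), any digits, and a tilde, so a name such as "Martin" is left alone because no tilde follows it.

```shell
# Hypothetical helper: put a tilde in front of the SexAge field.
to_tilde() {
  sed 's/ \([MFU][0-9]*~\)/~\1/'
}

# Sample record taken from the first post.
result=$(printf '%s\n' 'Bush, George Herbert Walker M70~1600 Pennsylvania Ave ~20500' | to_tilde)
printf '%s\n' "$result"
```

Once every field is tilde-delimited, SexAge is simply field 2 for any tool that accepts a field separator.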
Quote (originally posted by pixellany):
Code:
sed 's/ \([MFU][0-9]*~\)/~\1/'
Daniel B. Martin
A possible awk solution
I put these lines in a file named data.txt:
Code:
Wells, Herbert George M99~1 Main St ~60126
and this program in a file named sort_ifld.gawk:
Code:
BEGIN {
Code:
gawk -f sort_ifld.gawk < data.txt
Code:
Greenburg, Anna Olivia F28~15 Baker Court ~90156
The problem with just a simple sort is that the age won't be treated as a number. Instead, it will be treated as text. If you just run sort on this list of numbers:
Code:
1
Code:
1
Code:
001
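A small sketch of the text-versus-number problem described above. The list in the post is truncated, so the numbers here are made up for illustration. LC_ALL=C is set only to make the ordering predictable across locales.

```shell
# Made-up list of numbers (not the list from the post, which is cut off).
nums='10
9
100
2'

# Plain sort compares character by character, so "100" lands before "2".
text_sorted=$(printf '%s\n' "$nums" | LC_ALL=C sort)
printf '%s\n' "$text_sorted"
```

The same thing happens to ages embedded in a SexAge key: as text, M70 sorts ahead of M9.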
Quote:
I wonder why the sort command does not have an option to deal with numbers? |
Quote:
Daniel B. Martin |
So just for giggles, here is a ruby alternative (hopefully with enough comments for others to follow):
Code:
ruby -ne 'BEGIN{h=Hash.new{|k, p| k[p] = []}}; # Set up new hash of arrays
Of course, a more complex solution would be required if we were to assume (correctly, based on a voter format) that there are multiple people of the same age and sex.
Code:
echo; echo "Sort on Sexage"
The sort reorders the file based on the second tilde-delimited field. The second sed replaces all tildes with pairs of blanks.
Thanks to everyone who contributed ideas on this subject.
Daniel B. Martin
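The script in the post above is cut off, so the following is only a sketch of the sed, sort, sed pipeline it describes, not the original code. The three records are the ones that appear earlier in the thread. The variable names are illustrative.

```shell
# Records in the original format (blank before SexAge, tildes elsewhere).
records='Wells, Herbert George M99~1 Main St ~60126
Greenburg, Anna Olivia F28~15 Baker Court ~90156
Bush, George Herbert Walker M70~1600 Pennsylvania Ave ~20500'

# 1. First sed: turn the blank before SexAge into a tilde.
# 2. sort: order by the second tilde-delimited field (SexAge), as text.
# 3. Second sed: replace every tilde with a pair of blanks.
sorted=$(printf '%s\n' "$records" \
  | sed 's/ \([MFU][0-9]*~\)/~\1/' \
  | LC_ALL=C sort -t '~' -k 2,2 \
  | sed 's/~/  /g')
printf '%s\n' "$sorted"
```

With these records the order comes out as F28, M70, M99. Because the key is compared as text, ages with different digit counts would still need a numeric key.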
To modify Pixellany's version to sort by the number:
Code:
sort -t _ -k 2.2,2n
Read sort's info page for full details on how to limit your sorting fields.
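A sketch of how that key specification behaves, assuming the records are already tilde-delimited rather than underscore-delimited. The first record (Able, Ann) is made up to get a one-digit age. The other two come from earlier in the thread. The key 2.2,2n starts at the second character of field 2, which skips the sex letter, ends at the end of field 2, and compares numerically.

```shell
# Tilde-delimited records; field 2 is SexAge. The first record is hypothetical.
recs='Able, Ann~F9~1 First St~10001
Bush, George Herbert Walker~M70~1600 Pennsylvania Ave ~20500
Greenburg, Anna Olivia~F28~15 Baker Court ~90156'

# -t '~'    : quoted, because a bare ~ would be expanded by the shell to $HOME
# -k 2.2,2n : key runs from character 2 of field 2 to the end of field 2, numeric
by_age=$(printf '%s\n' "$recs" | LC_ALL=C sort -t '~' -k 2.2,2n)
printf '%s\n' "$by_age"
```

Here the ages come out as 9, 28, 70 regardless of sex, whereas a text comparison of the whole field would put F28 ahead of F9.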
Quote (originally posted by David the H.):
Code:
sort -t '~' -k 2.2,2n
Daniel B. Martin