awk: sort variable values and assign a name accordingly
%%%%%
|
Hmm, I'm not sure I follow the logic. For a start, why would you need a for loop when field 2 of each line can only be split into a single array? What I mean is that the values in the c2 array never change for an individual line, so the for loop only makes the if run 'a' times for no real reason. I would also hazard that, since 'a' contains the number of pieces the field is split into, it tells you how many letters you require: if a = 3 you only need 'a' and 'b'; if it is 6 you need 'a', 'b', 'c' and 'd'. You also test whether $2 has a space in it. Is there ever a case where it does not? (Your example suggests the test is not required.) |
%%%%%
|
One of my problems is that I don't see how the asort() and asorti() functions work, or whether they assign new indices to the values after sorting them.
It seems that asort() and asorti() sort indices and not values, and that they do it vertically (adding a new line for every index)! |
asort - sorts the values of the array, i.e. given a[1] = 2 and a[2] = 1, once sorted the new array will be b[1] = 1 and b[2] = 2:
Code:
asort(a, b)
Code:
print a[b[1]]
As for creating an array equal to the alphabet, the simplest is:
Code:
split("abcd...", c2, "")
So, using the above and something similar to the following logic:
Code:
awk -F"\t" '{n = split($2, a, "[ ;]+");asort(a, b);for(i=1;i<=(n-(n/3));i++)print i,b[i]}' file
The trick now will be how you get it back in the correct order :) |
Thanks grail!
Now I understand how these functions work. Just another question about asort: does it sort only numerical values? In my case, will the "AGE" values be sorted before or after the numbers? Or maybe discarded? |
No, everything is sorted, but since numbers sort before letters, the numbers will occupy the first set of indexes :)
|
%%%%%
|
As far as asort goes, the best way is to have a go and check the results. What I can tell you is that the sort simply looks at all items and does not filter out duplicates,
just as two identical numbers would both be kept, one after the other, i.e. 1 4 4 6. As for your final piece of code, you need to think about what you are passing:
Code:
b[1] = "a"
Here I would say you have confused asort and asorti. Only the latter works in the way you are trying to use it, as it is the actual index (i.e. what the "i" stands for) that will give the desired result. |
%%%%%
|
How about:
Code:
awk -F"\t" 'BEGIN{OFS=FS="\t";split("abcdefghijk",letters,"")}{split($2, a, "[ ;]+");n = asort(a, b);f3 = gensub(/AGE /,"","g",$2);for(i=1;i<=(n-(n/3));i++)sub(b[i],letters[i],f3);$3=f3}1' file |
%%%%%
|