Hi,
I am reading the GNU AWK User's Guide, and it contains a script related to sorting. This one:
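{
    $0 = tolower($0)    # remove case distinctions
    # remove punctuation
    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
    # count each word on the line
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    # sort on the second field (the count), numerically, highest first;
    # the obsolete "sort +1 -nr" form is rejected by many current sort versions
    sort = "sort -k 2nr"
    for (word in freq)
        printf "%s\t%d\n", word, freq[word] | sort
    close(sort)
}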
I was wondering how to limit the output. What I mean is: what if I don't need every word, only the 50 most frequent? I tried storing the output in an array and then using the built-in delete statement in a for loop
(something like for (i = var; i > var; i++) delete array[i], where var is a variable given on the command line with -v var), but I get an error: attempt to use scalar as an array. Any suggestions?
P.S. Yes, this can be done with the Bourne shell, but then why am I reading the awk manual?
P.P.S. The book I am reading, "awk&sed", does not give an answer to this.
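
For reference, here is a rough sketch of one awk-only way the limit could work, with no external sort at all: in the END block, pick the most frequent remaining word, print it, delete it from freq, and repeat up to max times. The shell function name top_words and the variable names max, best and bestcount are made up for this illustration and are not from the manual. Words with equal counts may come out in any order.

```sh
# Hypothetical wrapper: top_words N reads text on stdin and prints
# the N most frequent words, one "word<TAB>count" pair per line.
top_words() {
    awk -v max="$1" '
    {
        $0 = tolower($0)    # remove case distinctions
        # remove punctuation
        gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        for (n = 1; n <= max; n++) {
            # find the most frequent word still left in freq
            best = ""
            bestcount = 0
            for (word in freq)
                if (freq[word] > bestcount) {
                    best = word
                    bestcount = freq[word]
                }
            if (bestcount == 0)    # freq is empty, nothing left to print
                break
            printf "%s\t%d\n", best, bestcount
            delete freq[best]      # so the next pass finds the runner-up
        }
    }'
}

# Example call (input.txt is a placeholder file name):
#   top_words 50 < input.txt
```

This rescans the whole array once per printed word, so it does roughly max times the number of distinct words comparisons. That should be fine for 50 words, but for a large limit the sort pipe from the manual is probably the better fit.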