Hi folks.
Overall goal: list the number of occurrences of every word in a spurious olde-english-sounding text file. I'd like the output to be something like:
words instances
and 17555
it 17530
came 17530
to 17530
pass 17523
some-word 4588
behooveth 677
yea 675
behold 666
sucketh 555
...
So far I've
1) downloaded the text file to my Linux system
2) ran this command to parse each word into its own line:
awk '{for(i=1;i<=NF;i++) print $i}' book_of_xxxmon.txt > outfile1.txt
3) sorted the data:
sort -d outfile1.txt > outfile2.txt
4) tried using sed to pull out punctuation (,.; but ended up using OpenOffice Writer to do that manually > saved as outfile2.txt
5) pulled out the unique words:
uniq outfile2.txt > uniq.bom.txt
I *know* there has to be a cleaner and easier way to do all that but that's all I could do.
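For what it's worth, steps 2 to 5 could probably be collapsed into one pipeline. The following is only a sketch: the function name unique_words is made up for illustration, and folding everything to lowercase is an assumption (the steps above don't do that). It strips punctuation only at the start and end of each word, so something like some-word stays in one piece, and it strips before sorting so that duplicates end up next to each other.

```shell
# unique_words FILE  (hypothetical helper name)
# Print every distinct word in FILE once, lowercased and sorted.
unique_words() {
    tr -s '[:space:]' '\n' < "$1" |      # one word per line
        sed -e 's/^[[:punct:]]*//' \
            -e 's/[[:punct:]]*$//' \
            -e '/^$/d' |                 # trim edge punctuation, drop empty lines
        tr '[:upper:]' '[:lower:]' |     # assumption: "And" and "and" are the same word
        sort -u                          # sort and keep one copy of each word
}
```

Something like `unique_words book_of_xxxmon.txt > uniq.bom.txt` would then replace the intermediate outfile1.txt and outfile2.txt files.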
Now I'd like to use my new "uniq.bom.txt" file to count how many times each of these words occurs in the original file. I'd rather not have to go through my unique listing by hand and run a command such as this for every single word:
{ printf '%s ' 'pass'; grep -ow 'pass' book_of_xxxmon.txt | wc -l; } >> final.list.txt
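One possible way around the per-word grep is to count everything in a single pass with sort and uniq -c, without needing the unique list at all. This is a rough sketch under the same assumptions as before (lowercase everything, trim punctuation only at word edges); the function name word_counts is made up for illustration.

```shell
# word_counts FILE  (hypothetical helper name)
# Print "word count" for every word in FILE, most frequent first.
word_counts() {
    tr -s '[:space:]' '\n' < "$1" |      # one word per line
        sed -e 's/^[[:punct:]]*//' \
            -e 's/[[:punct:]]*$//' \
            -e '/^$/d' |                 # trim edge punctuation, drop empty lines
        tr '[:upper:]' '[:lower:]' |     # assumption: case does not matter
        sort |                           # group identical words together
        uniq -c |                        # prefix each distinct word with its count
        sort -k1,1nr -k2 |               # highest count first, ties in word order
        awk '{ print $2, $1 }'           # swap columns to "word count"
}
```

Running `word_counts book_of_xxxmon.txt > final.list.txt` should give a listing in roughly the shape shown at the top, minus the "words instances" header line.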
Any ideas (preferably better ones than mine...)?