Hi Gurus,
I've been at sed again and am not getting very far.
I have lots of text files (dictionaries of inverted text indexes), the content of which looks like this:
Code:
475470
#term doc freq idx
carbendacime 1 114569
carbendacime35 1 114570
carbendazim 1 114571
carbene 5 114572
carbeni 5 114573
carbenicillin 4 114574
carbenoxolone 1 114575
carbethoxypsoralen 1 114576
Here I only care about the first and second tokens, which are the term and the doc freq, e.g. carbendacime and 1, carbendacime35 and 1, carbendazim and 1, etc.
I would like to use sed to identify all terms which have a doc freq value of >=10, and then print each matching (term, doc freq) tuple out to a new text file.
Any advice on whether to even use sed as opposed to awk would be greatly appreciated.
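For what it's worth, my rough guess is that awk is the better fit, since sed has no numeric comparison. Below is a minimal sketch of what I imagine it would look like. The file names (dict_sample.txt, frequent_terms.txt) and the filter_terms helper are just names I made up for illustration, and I am assuming every file has the same layout as the sample above (one count line, one # header line, then whitespace-separated rows).
Code:
```shell
#!/bin/sh
# Sample input copied from the listing above; the file name is hypothetical.
cat > dict_sample.txt <<'EOF'
475470
#term doc freq idx
carbendacime 1 114569
carbendacime35 1 114570
carbendazim 1 114571
carbene 5 114572
carbeni 5 114573
carbenicillin 4 114574
carbenoxolone 1 114575
carbethoxypsoralen 1 114576
EOF

# filter_terms MIN FILE (hypothetical helper)
# Prints "term docfreq" for every row whose doc freq is >= MIN.
#   !/^#/     skips the "#term doc freq idx" header line
#   NF >= 2   skips the leading count line, which has only one field
#   $2 + 0    forces a numeric (not string) comparison on the doc freq
filter_terms() {
    awk -v min="$1" '!/^#/ && NF >= 2 && $2 + 0 >= min { print $1, $2 }' "$2"
}

# Threshold of 10 as described above; results go to a new text file.
filter_terms 10 dict_sample.txt > frequent_terms.txt
```
None of the rows in the small sample reach a doc freq of 10, so frequent_terms.txt comes out empty here; lowering the threshold (for example filter_terms 5 dict_sample.txt) is a quick way to check that the filter behaves as expected.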
Thank you
Lewis