Remove duplicate entries from first column?

kadvar · 05-12-2010, 05:20 PM

Hi, I have a huge (over 10 GB) file with a list of IPs, each followed by a corresponding number, like this:

Code:

12.32.34.23  10
143.32.34.543  11
232.32.45.65  12
54.23.5.232  13
143.32.34.43  14

and so on.

I'm trying to sort this file numerically and weed out any duplicate IP addresses. How do I do this in bash? I have come up with this, but obviously it doesn't work:

Code:

$ sort -n myfile.txt | cut -f1 | uniq -u

Please remember that this needs to scale up to a huge file, and my machine only has about 2 GB of RAM.

Thanks,
Adi

kadvar · 05-12-2010, 05:34 PM

Never mind, I found my answer.

I ran:

Code:

sort -n -u -k1 myfile.txt
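
A caveat on the command above, assuming GNU sort in a locale where "." is the decimal point: -k1 with no end field makes the key run to the end of the line, and -n reads only the leading decimal number from it. So 143.32.34.543 and 143.32.34.43 both compare as 143.32, and -u keeps just one of them. Below is a minimal sketch of two alternatives. The function names are hypothetical, and both assume each line starts with the IP, followed by whitespace and the number.

```shell
#!/usr/bin/env bash

# Hypothetical helper: keep one line per distinct first column.
# Output is in byte order of the IP text, not numeric order.
dedupe_by_ip() {
    # LC_ALL=C : compare keys byte by byte
    # -k1,1    : the key is field 1 only, not field 1 to end of line
    # -u       : print one line for each distinct key
    LC_ALL=C sort -u -k1,1 "$1"
}

# Hypothetical helper: same deduplication, but ordered numerically
# octet by octet (so 54.x.x.x sorts before 143.x.x.x).
dedupe_by_ip_numeric() {
    # -t.      : split fields on "." so each octet is its own field
    # -kN,Nn   : compare octet N as a number; for the 4th field the
    #            numeric parse stops at the blank before the count,
    #            so the trailing number does not affect the key
    LC_ALL=C sort -t. -u -k1,1n -k2,2n -k3,3n -k4,4n "$1"
}

# Example (myfile.txt is the file name used in the thread):
#   dedupe_by_ip_numeric myfile.txt > deduped.txt
```

sort works as an external merge sort and spills to temporary files when the input does not fit in memory, so a 10 GB file on a 2 GB machine should be workable as long as the temp directory has enough free space. With GNU sort, -T chooses the temp directory and -S sets the in-memory buffer size.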

grail · 05-12-2010, 06:22 PM

Please mark as SOLVED once you have your answer.