Text file processing question
Hi all,
I have a dilemma I hope you can help me solve.
I have a largish text file that I want to process.
The file has 4 columns separated by tabs, so it looks like this:
fred john dave pete
dave pete terry phil
john dave pete fred
I would like to remove every line whose column 4 value appears more than once in the file.
I am not looking to deduplicate (keeping one copy of each); I want to completely remove all lines that share a column 4 value with another line.
So if 2 or more lines have the same entry in column 4, remove all of those rows. I want to be left with only the rows whose column 4 entry occurs exactly once in the file.
Could you help with this? If so, thanks in advance.
PS. I am using CentOS, so I guess tools like grep and awk might do it; I just don't know how.
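
To make the requirement concrete, here is a minimal sketch of one possible approach with awk, not a definitive solution. It assumes the file really is tab-separated and can be read twice (so it must be a regular file, not a pipe). The function name keep_unique_col4 and the sample rows are made up for illustration; the fourth sample row is added only to create a duplicate in column 4.

```shell
#!/bin/sh
# keep_unique_col4 FILE (hypothetical helper name)
# Prints only the rows whose column-4 value occurs exactly once in FILE.
# FILE is passed to awk twice so it can be scanned in two passes.
keep_unique_col4() {
    awk -F'\t' '
        NR == FNR { count[$4]++; next }  # pass 1: count each column-4 value
        count[$4] == 1                   # pass 2: print rows whose value was seen once
    ' "$1" "$1"
}

# Made-up sample data: "pete" appears twice in column 4.
sample=$(mktemp)
printf 'fred\tjohn\tdave\tpete\n'  >  "$sample"
printf 'dave\tpete\tterry\tphil\n' >> "$sample"
printf 'john\tdave\tpete\tfred\n'  >> "$sample"
printf 'bob\tjim\tsam\tpete\n'     >> "$sample"

keep_unique_col4 "$sample"
# Prints the "phil" and "fred" rows; both "pete" rows are dropped.
```

On a real file the same idea would be run as keep_unique_col4 input.txt > output.txt, where input.txt and output.txt are placeholder names. The two-pass design keeps the original row order and only stores one counter per distinct column 4 value in memory.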
Last edited by seabro; 05-22-2012 at 06:06 PM.