LinuxQuestions.org


MikeyCarter 01-30-2008 12:38 PM

Finding duplicate lines in a file
 
I know how to remove duplicate lines in bash, but how do I get a report showing only the duplicate lines in a file?

MikeyCarter 01-30-2008 12:41 PM

Never mind... I searched for hours and couldn't find anything. I post a question here and the answer magically presents itself...


cmd: uniq
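
For anyone landing here later, a minimal sketch of how uniq can produce that report. It assumes GNU coreutils and a made-up sample file (the temp file and its contents are hypothetical, not from the original question). Since uniq only compares adjacent lines, the input is sorted first.

```shell
# Build a small sample file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$sample"

# sort groups identical lines together; uniq -d then prints one copy
# of each line that occurs more than once.
dupes=$(sort "$sample" | uniq -d)
printf '%s\n' "$dupes"

# uniq -c prefixes each line with its count; awk keeps counts above 1.
# (Printing $2 is enough here because the sample lines are single words.)
counts=$(sort "$sample" | uniq -c | awk '$1 > 1 {print $1, $2}')
printf '%s\n' "$counts"

rm -f "$sample"
```

The first command lists each duplicated line once; the second adds how many times it occurred.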

johnrw 10-05-2008 05:25 PM

Thanks for that one.
Ya know... I ran uniq on a file like so...
uniq -D -w 32 someMD5file.md5

and I know there were 13 identical md5sums (of zero-byte files), but it only printed 2 of them?

d41d8cd98f00b204e9800998ecf8427e is the troublesome md5sum.
Bug?

johnrw 10-05-2008 06:28 PM

A little further reading led me to a small-print gotcha...
http://www.linuxformat.co.uk/index.p...wtopic&p=63243 says:
Quote:

Now, it's important that we run sort before piping the output to
uniq, because uniq only removes duplicate adjacent lines. By sorting
the file beforehand, all instances of repetition are lumped
together, and therefore uniq removes everything but the first line
in a series of repetitive lines.

cat someMD5file.md5 | sort | uniq --all-repeated=separate -w 32

gave me the complete list of duplicate md5sums.
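
The adjacency gotcha is easy to reproduce. A minimal sketch, assuming GNU uniq (for -D and -w) and a made-up checksum list: the file names and the two non-empty checksums are placeholders, and only the empty-file md5sum comes from the posts above.

```shell
# Hypothetical checksum list: the empty-file md5sum appears three times,
# but never on adjacent lines. The other two values are arbitrary
# 32-hex-digit placeholders.
md5file=$(mktemp)
cat > "$md5file" <<'EOF'
d41d8cd98f00b204e9800998ecf8427e  empty1
0cc175b9c0f1b6a831c399e269772661  a.txt
d41d8cd98f00b204e9800998ecf8427e  empty2
92eb5ffee6ae2fec3ad71c777531578f  b.txt
d41d8cd98f00b204e9800998ecf8427e  empty3
EOF

# Unsorted: no two matching lines are adjacent, so uniq reports nothing.
unsorted=$(uniq -D -w 32 "$md5file")

# Sorted: the matching lines sit together, so all three are reported.
sorted=$(sort "$md5file" | uniq -D -w 32)

printf 'unsorted:\n%s\n' "$unsorted"
printf 'sorted:\n%s\n' "$sorted"

rm -f "$md5file"
```

Here -w 32 limits the comparison to the first 32 characters (the checksum), so lines with the same md5sum but different file names still count as duplicates.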


All times are GMT -5.