LinuxQuestions.org
01-30-2008, 11:38 AM   #1
MikeyCarter
Member
 
Registered: Feb 2003
Location: Orangeville
Distribution: Fedora
Posts: 448

Rep: Reputation: 31
Finding duplicate lines in a file


I know how to remove duplicate lines in bash, but how do I get a report showing only the duplicate lines in a file?
 
01-30-2008, 11:41 AM   #2
MikeyCarter
Member
 
Registered: Feb 2003
Location: Orangeville
Distribution: Fedora
Posts: 448

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by MikeyCarter
I know how to remove duplicate lines in bash, but how do I get a report showing only the duplicate lines in a file?
Never mind... I searched for hours and couldn't find anything. Then I posted a question here, and the answer magically presented itself...


cmd: uniq
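
For anyone landing here later, a minimal sketch of how uniq can produce that report. The sample file and its contents are made up for illustration. The -d and -c options are standard uniq options; the input is sorted first because uniq only compares adjacent lines.

```shell
# Hypothetical sample file; the contents are invented for this example.
tmpfile=$(mktemp)
printf '%s\n' apple banana apple cherry banana apple > "$tmpfile"

# -d prints one copy of each line that occurs more than once.
dupes=$(sort "$tmpfile" | uniq -d)
printf '%s\n' "$dupes"

# -c adds a count in front of each line; combined with -d it lists
# only the repeated lines together with how often each one occurs.
counts=$(sort "$tmpfile" | uniq -cd)
printf '%s\n' "$counts"

rm -f "$tmpfile"
```

The first command lists each duplicated line once; the second adds how many times it appeared.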
 
10-05-2008, 04:25 PM   #3
johnrw
LQ Newbie
 
Registered: Oct 2008
Posts: 2

Rep: Reputation: 0
Thanks for that one.
Ya know... I ran uniq on a file like so...
uniq -D -w 32 someMD5file.md5

and I know there were 13 identical md5sums (of zero-byte files), but it only printed 2 of them?

d41d8cd98f00b204e9800998ecf8427e is the troublesome md5sum.
Bug?

Last edited by johnrw; 10-05-2008 at 05:13 PM.
 
10-05-2008, 05:28 PM   #4
johnrw
LQ Newbie
 
Registered: Oct 2008
Posts: 2

Rep: Reputation: 0
A little further reading led me to a small-print gotcha...
http://www.linuxformat.co.uk/index.p...wtopic&p=63243 says:
Quote:
Now, it's important that we run sort before piping the output to
uniq, because uniq only removes duplicate adjacent lines. By sorting
the file beforehand, all instances of repetition are lumped
together, and therefore uniq removes everything but the first line
in a series of repetitive lines.
sort someMD5file.md5 | uniq --all-repeated=separate -w 32
gave me the complete list of duplicate md5sums.
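
To make the gotcha concrete, here is a small sketch that reproduces it. The checksum list and file names are made up for illustration, and -D and -w are GNU uniq options, so this assumes GNU coreutils. The lines for the empty-file hash are deliberately not all adjacent.

```shell
# Hypothetical checksum list; file names are invented for this example.
md5file=$(mktemp)
cat > "$md5file" <<'EOF'
d41d8cd98f00b204e9800998ecf8427e  empty1.txt
0cc175b9c0f1b6a831c399e269772661  a.txt
d41d8cd98f00b204e9800998ecf8427e  empty2.txt
d41d8cd98f00b204e9800998ecf8427e  empty3.txt
EOF

# Unsorted: uniq only sees the adjacent pair (empty2, empty3),
# so empty1.txt is missing from the report.
unsorted=$(uniq -D -w 32 "$md5file")
printf 'without sort:\n%s\n' "$unsorted"

# Sorted first: all three matching lines become adjacent and are reported.
sorted=$(sort "$md5file" | uniq -D -w 32)
printf 'with sort:\n%s\n' "$sorted"

rm -f "$md5file"
```

Without the sort, only two of the three matching lines show up, which is the same symptom as in post #3.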
 
  

