Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.
Old 10-26-2006, 10:35 AM   #1
Registered: Oct 2003
Location: CA
Posts: 165

Rep: Reputation: 30
searching for duplicate files, but named differently

I am looking for a way to find duplicate MP3s. I run a radio station, and we have several thousand duplicated MP3s, but the names are all different. Is there any way someone can point me in the right direction?

All the files are named Mnnnnnn.mp3 or Pnnnnnn.mp3 (where nnnnnn is a sequential number) and I want to search through and delete the ones that are duplicated. I also need some kind of list of the files that were deleted afterward, either in text or CSV, so I know which numbers are available to be used again.

Thanks in advance.

Old 10-26-2006, 11:08 AM   #2
Registered: Nov 2004
Location: Florida, USA
Distribution: Debian, Redhat
Posts: 416

Rep: Reputation: 53
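You are probably going to want to create a database that lists what each song is and what its md5sum is. Two files with the same md5sum almost certainly have identical contents, whatever they are named, so this would give you a nice way to tell if there are duplicates.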
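A rough sketch of what such a checksum "database" could look like as a plain text file. The function names build_index and list_duplicate_sums are made up for illustration, and it assumes GNU md5sum and file names without newlines or backslashes:

```shell
#!/bin/bash
# Hypothetical helper: build_index DIR OUTFILE
# Writes one "md5sum  path" line per .mp3 file under DIR, sorted by
# checksum, so files with identical content land on adjacent lines.
build_index() {
    find "$1" -type f -name '*.mp3' -exec md5sum {} + | sort > "$2"
}

# Hypothetical helper: list_duplicate_sums INDEXFILE
# Prints each checksum that appears more than once in the index.
list_duplicate_sums() {
    cut -d' ' -f1 "$1" | sort | uniq -d
}
```

Something like `build_index /path/to/mp3s index.txt` followed by `list_duplicate_sums index.txt` would then show which checksums are shared; grepping the index for one of those checksums lists the files involved.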
Old 10-26-2006, 11:38 AM   #3
Senior Member
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,293

Rep: Reputation: 335

I had a little script to check for duplicate files.


find . -type f -exec md5sum {} ";" |sort >/tmp/md5sums.txt
cat /tmp/md5sums.txt |
while read line
do
        sum=$(echo $line |cut -d" " -f1)
        file=$(echo $line |cut -d" " -f2)
        if [ "$sum" = "$last_sum" ];
        then
                echo "$file looks like a duplicate with $last_file"
                # diff -q prints nothing when the two files are identical
                if [ -z "$(diff -q "$file" "$last_file")" ];
                then
                    echo "$file and $last_file are the same."
                    #rm $file
                fi
        fi
        last_sum=$sum
        last_file=$file
done
rm /tmp/md5sums.txt
I added a #rm $file line there. My script did reporting only, so if you want one of each pair deleted, just remove the # to get the file deleted. The script is not perfect by any means, but it did the job when I wanted to search for duplicates. I don't think it will cope with filenames containing spaces and the like, so use at your own risk!

Note - cd to the directory with the mp3 before you run it.
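For the "list of deleted files" part of the original question, here is a hedged sketch along the same lines that also writes a CSV log. The function name dedupe_report, the CSV column names and the --delete switch are all invented for this example. It assumes GNU md5sum and file names without newlines, backslashes or commas, and it only deletes anything when asked to:

```shell
#!/bin/bash
# Hypothetical helper: dedupe_report DIR CSVFILE [--delete]
# Keeps the first file of each group of identical files and logs every
# later copy to CSVFILE as "duplicate,kept". Files are only removed when
# the third argument is --delete.
dedupe_report() {
    local dir="$1"
    local csv="$2"
    local mode="${3:-}"
    echo "duplicate,kept" > "$csv"
    # Sort by checksum so identical files sit on adjacent lines.
    find "$dir" -type f -name '*.mp3' -exec md5sum {} + | sort |
    while read -r sum file
    do
        # Same checksum AND byte-for-byte equal: treat as a duplicate.
        if [ "$sum" = "$last_sum" ] && cmp -s "$file" "$last_file"
        then
            echo "$file,$last_file" >> "$csv"
            if [ "$mode" = "--delete" ]; then rm -- "$file"; fi
        else
            last_sum=$sum
            last_file=$file
        fi
    done
}
```

Running `dedupe_report . deleted.csv` first gives a report without touching anything; adding `--delete` as a third argument removes the logged copies. The first column of the CSV then shows which Mnnnnnn/Pnnnnnn numbers have been freed.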
Old 10-27-2006, 05:08 AM   #4
Senior Member
Registered: Mar 2004
Location: england
Distribution: Debian, Mint, Puppy, Raspbian
Posts: 3,421

Rep: Reputation: 200

while read line
        sum=$(echo $line |cut -d" " -f1)
        file=$(echo $line |cut -d" " -f2)
This can be done more neatly without 'cut', because read can split the line into fields itself:

while read sum file
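A small sketch of how that splitting behaves on one line of md5sum output. The helper name split_md5_line and the sample line are made up for illustration. The first word goes into sum and everything that is left goes into file, so a name containing spaces survives:

```shell
#!/bin/bash
# Hypothetical helper: prints the checksum and the file name from one
# md5sum output line, split by read itself rather than by cut.
split_md5_line() {
    local sum file
    read -r sum file <<EOF
$1
EOF
    echo "sum=$sum"
    echo "file=$file"
}

# The checksum below is only a placeholder value.
split_md5_line 'd41d8cd98f00b204e9800998ecf8427e  ./My Song.mp3'
# prints:
# sum=d41d8cd98f00b204e9800998ecf8427e
# file=./My Song.mp3
```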
Old 10-27-2006, 05:15 AM   #5
Senior Member
Registered: Mar 2004
Location: england
Distribution: Debian, Mint, Puppy, Raspbian
Posts: 3,421

Rep: Reputation: 200
I have got a Perl script that does a similar job to the bash one but recurses down into subdirectories.
It produces a list by comparing the cksum of the files.

With MP3s, though, I should think that if the ID3 tags are different then naturally
the cksum will be different, even when the audio itself is the same.

see this script
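To illustrate the ID3 point (this is not the Perl script mentioned above, just a rough sketch): an ID3v1 tag is the last 128 bytes of the file and starts with the letters "TAG", so one option is to leave those bytes out before checksumming. The function name audio_md5 is made up, and ID3v2 tags, which sit at the start of the file, are not handled here at all:

```shell
#!/bin/bash
# Hypothetical helper: audio_md5 FILE
# Prints the MD5 of FILE with a trailing ID3v1 tag (the last 128 bytes,
# starting with "TAG") left out. ID3v2 tags are NOT handled.
audio_md5() {
    local size
    size=$(wc -c < "$1")
    if [ "$size" -ge 128 ] && [ "$(tail -c 128 "$1" | head -c 3)" = "TAG" ]
    then
        # Hash everything except the 128-byte tag at the end.
        head -c "$((size - 128))" "$1" | md5sum | cut -d' ' -f1
    else
        # No ID3v1 tag found: hash the whole file.
        md5sum < "$1" | cut -d' ' -f1
    fi
}
```

Two copies of the same track that differ only in their ID3v1 tag should then give the same audio_md5 value, even though a plain md5sum or cksum of the files differs.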


All times are GMT -5.