LinuxQuestions.org
Old 10-26-2006, 09:35 AM   #1
matthewhardwick
Member
 
Registered: Oct 2003
Location: CA
Posts: 165

Rep: Reputation: 30
Searching for duplicate files that are named differently


I am looking for a way to find duplicate MP3s. I run a radio station, and we have several thousand duplicated MP3s, but the names are all different. Is there any way someone can point me in the right direction?

All the files are named Mnnnnnn.mp3 or Pnnnnnn.mp3 (where nnnnnn is a sequential number), and I want to search through them and delete the ones that are duplicated. I also need a list of the files that were deleted, as either text or CSV, so I know which numbers are available to be used again.

Thanks in advance.

Matthew.
 
Old 10-26-2006, 10:08 AM   #2
Wells
Member
 
Registered: Nov 2004
Location: Florida, USA
Distribution: Debian, Redhat
Posts: 383

Rep: Reputation: 31
You are probably going to want to create a database that lists what each song is and what its md5sum is. Two files with the same md5sum almost certainly have the same content, so this would give you a nice way to tell if there are duplicates.
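A minimal sketch of that idea, using a flat sorted text listing in place of a real database. The function name list_duplicates is made up for illustration, and the -w and --all-repeated options assume GNU uniq:

```shell
#!/bin/bash
# list_duplicates DIR (hypothetical helper): print groups of files under DIR
# that share an MD5 checksum, one "checksum  filename" line per file.
list_duplicates() {
    # An MD5 checksum is 32 hex characters, so uniq compares only the first
    # 32 characters of each sorted line and prints every line of a repeated
    # group, with a blank line between groups.
    find "$1" -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
}

# Run on the directory given as the first argument, if there is one.
if [ "$#" -gt 0 ]; then
    list_duplicates "$1"
fi
```

Files that appear nowhere in the output have a unique checksum. Nothing is deleted here; it only reports.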
 
Old 10-26-2006, 10:38 AM   #3
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,136

Rep: Reputation: 230
Hi.

I had a little script to check for duplicate files.

Code:
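#!/bin/bash
# Report files under the current directory whose contents are identical.

last_sum=x
last_file=x
# Checksum every regular file; sorting puts equal checksums on adjacent lines.
find . -type f -exec md5sum {} ";" | sort > /tmp/md5sums.txt
while read -r line
do
        # Field 2 is only the first word of the name, so names with spaces get truncated.
        sum=$(echo $line |cut -d" " -f1)
        file=$(echo $line |cut -d" " -f2)
        if [ "$sum" = "$last_sum" ]
        then
                echo "$file looks like a duplicate of $last_file"
                # cmp -s prints nothing and exits 0 only when the files are byte-for-byte identical.
                if cmp -s "$file" "$last_file"
                then
                    echo "$file and $last_file are the same."
                    #rm "$file"
                fi
        else
                # Advance the reference only when the checksum changes, so it never names a deleted file.
                last_sum="$sum"
                last_file="$file"
        fi
done < /tmp/md5sums.txt
rm /tmp/md5sums.txt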
I left a commented-out rm in there. My script only did reporting, so if you want one file of each duplicate pair deleted, just remove the #. The script is not perfect by any means, but it did the job when I wanted to search for duplicates. I don't think it will like filenames with spaces in them and such, so use at your own risk!

Note: cd to the directory with the MP3s before you run it.
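The original question also asked for a list of what was deleted, as text or CSV. Here is a rough sketch of one way to get that. It is not the script above, just the same checksum-then-compare idea with a log added. The function name dedupe_log, the CSV column names, and the "delete" switch are all made up for illustration, and it assumes bash plus filenames without commas or newlines (true for the Mnnnnnn.mp3 / Pnnnnnn.mp3 scheme):

```shell
#!/bin/bash
# dedupe_log DIR CSV [delete]   (hypothetical helper)
# Writes one CSV row per verified duplicate .mp3 under DIR:
#   removed_file,kept_file,md5
# Files are only removed when the third argument is the word "delete";
# otherwise the CSV is just a report of what would be removed.
dedupe_log() {
    local dir=$1 csv=$2 mode=${3:-report}
    local sum file last_sum= last_file=
    echo "removed_file,kept_file,md5" > "$csv"
    # read puts the checksum in $sum and the rest of the line in $file.
    while read -r sum file
    do
        # Same checksum AND byte-for-byte identical: treat as a duplicate.
        if [ "$sum" = "$last_sum" ] && cmp -s "$file" "$last_file"
        then
            printf '%s,%s,%s\n' "$file" "$last_file" "$sum" >> "$csv"
            if [ "$mode" = delete ]; then
                rm -- "$file"
            fi
        else
            # New checksum: this file becomes the copy that is kept.
            last_sum=$sum
            last_file=$file
        fi
    done < <(find "$dir" -type f -name '*.mp3' -exec md5sum {} + | sort)
}
```

For example, `dedupe_log . deleted.csv` only writes the report, and `dedupe_log . deleted.csv delete` also removes the files. Within each group the name that sorts first is the one kept, so the first column of the CSV gives the numbers that become free again.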
 
Old 10-27-2006, 04:08 AM   #4
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
FYI


Code:
while read line
do
        sum=$(echo $line |cut -d" " -f1)
        file=$(echo $line |cut -d" " -f2)
This can be done more neatly without 'cut', by letting read split the line itself:

Code:
while read sum file
do
       ....
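To illustrate why this works: read assigns the first word to the first variable and everything left on the line to the last one, so a filename containing spaces stays in one piece. A small sketch (the function name split_sum_line is made up for the demo, and the here-string needs bash):

```shell
#!/bin/bash
# split_sum_line LINE (hypothetical helper): split one line of md5sum output
# into its checksum and filename using read alone.
split_sum_line() {
    local sum file
    # -r stops read from treating backslashes as escape characters.
    read -r sum file <<< "$1"
    printf 'sum=%s\nfile=%s\n' "$sum" "$file"
}

split_sum_line 'd41d8cd98f00b204e9800998ecf8427e  ./my song.mp3'
# prints:
#   sum=d41d8cd98f00b204e9800998ecf8427e
#   file=./my song.mp3
```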
 
Old 10-27-2006, 04:15 AM   #5
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
I have a Perl script that does a similar job to the bash one but recurses down.
It produces a list by comparing the cksum of the files.


With MP3s, though, I should think that if the ID3 tags are different then naturally
the cksum will be different, even when the audio itself is the same.

See this script: http://www.linuxquestions.org/questi...94#post2343194
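One possible way around the tag problem is to checksum only the part of the file between the tags. This is a rough sketch, not the Perl script linked above. The function name audio_md5 is made up, and it only knows about two layouts: an ID3v2 tag at the start (a 10-byte header whose last four bytes hold the tag size, 7 bits per byte) and an ID3v1 tag in the last 128 bytes (starting with "TAG"). Other tag types such as APE or Lyrics3 are not handled, and it assumes GNU-style head, tail, od and md5sum:

```shell
#!/bin/bash
# audio_md5 FILE (hypothetical helper): print the MD5 of FILE with any
# leading ID3v2 tag and trailing ID3v1 tag left out of the checksum.
audio_md5() {
    local f=$1 size start=0 end b
    size=$(wc -c < "$f")
    end=$size
    # ID3v2: file starts with "ID3" (hex 49 44 33).
    if [ "$size" -ge 10 ] && [ "$(od -An -tx1 -N3 "$f" | tr -d ' \n')" = "494433" ]; then
        # Bytes 5..9 as decimals: the flags byte, then 4 size bytes.
        b=( $(od -An -tu1 -j5 -N5 "$f") )
        start=$(( 10 + (b[1] << 21 | b[2] << 14 | b[3] << 7 | b[4]) ))
        # Flag bit 0x10 means a 10-byte footer follows the tag.
        if (( b[0] & 0x10 )); then
            start=$(( start + 10 ))
        fi
        # A size that runs past the end of the file is bogus; ignore the tag.
        [ "$start" -le "$size" ] || start=0
    fi
    # ID3v1: the last 128 bytes start with "TAG" (hex 54 41 47).
    if [ $(( end - start )) -ge 128 ] &&
       [ "$(tail -c 128 "$f" | od -An -tx1 -N3 | tr -d ' \n')" = "544147" ]; then
        end=$(( end - 128 ))
    fi
    # Checksum only the bytes from offset start up to (not including) end.
    tail -c +"$(( start + 1 ))" "$f" | head -c "$(( end - start ))" | md5sum | cut -d' ' -f1
}

# Print "checksum  filename" for every file named on the command line.
for f in "$@"; do
    printf '%s  %s\n' "$(audio_md5 "$f")" "$f"
done
```

The output has the same shape as md5sum's, so it can be sorted and compared in the same way. Two copies of a song that were encoded separately will still differ; this only helps when the audio bytes are identical and just the tags were edited.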
 
  


All times are GMT -5.
