LinuxQuestions.org
Old 12-28-2010, 11:41 AM   #1
antcore99
LQ Newbie
 
Registered: Dec 2010
Posts: 7

Rep: Reputation: 0
Duplicate files and a base directory


I have two directories. I want to know which files in the second directory also appear in the first, and then delete the duplicates from the second directory. The filenames might differ even when the contents are identical (so that rules out diff).

My problem is that various programs (such as fdupes and freedup) are very capable of finding duplicate files, but they give no control over which copy is deleted (or linked): it may come from either the first or the second directory.

Here is an example with fdupes:

Code:
[user@pc ~]$ fdupes -rdN /dir1/  /dir2/

  [+] /dir2/Geluidsclip 03.wav
  [-] /dir1/Geluidsclip 03.wav


  [+] /dir2/Geluidsclip 04.wav
  [-] /dir1/Geluidsclip 04.wav


  [+] /dir1/Geluidsclip 07.wav
  [-] /dir2/Geluidsclip 07.wav

As you can see, fdupes removes the dir1 copy in the first two pairs but the dir2 copy in the third. My aim is to have files deleted only from dir2. I know that fdupes can't do this, as I emailed the author.
 
Old 12-28-2010, 12:23 PM   #2
antcore99
LQ Newbie
 
Registered: Dec 2010
Posts: 7

Original Poster
Rep: Reputation: 0
One workaround: list the duplicates without deleting anything, grep the output for "dir2", and then remove all those entries, e.g. by feeding them to xargs.
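A rough sketch of that filtering step. It assumes that fdupes, when run without -d, prints each set of duplicates as a group of paths separated by blank lines. The function name select_target_dupes is made up for this example. It prints the dir2 paths only from groups that also contain a dir1 path, so files duplicated only inside dir2 are left alone.

```shell
# Hypothetical helper: reads fdupes-style groups on stdin.
# $1 = path prefix of the base directory (kept), e.g. /dir1/
# $2 = path prefix of the target directory (candidates for deletion), e.g. /dir2/
select_target_dupes() {
    awk -v base="$1" -v target="$2" '
        # Paragraph mode: one record per blank-line-separated group,
        # one field per line (path) inside the group.
        BEGIN { RS = ""; FS = "\n" }
        {
            has_base = 0
            for (i = 1; i <= NF; i++)
                if (index($i, base) == 1) has_base = 1
            # Only report target files that have a copy in the base directory.
            if (has_base)
                for (i = 1; i <= NF; i++)
                    if (index($i, target) == 1) print $i
        }'
}
```

It could then be used as `fdupes -r /dir1/ /dir2/ | select_target_dupes /dir1/ /dir2/ | tr '\n' '\0' | xargs -0 rm -v --`. Check the list before adding the rm part. Filenames containing newlines or backslashes are not handled by this sketch.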
 
Old 12-28-2010, 12:32 PM   #3
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682Reputation: 682Reputation: 682Reputation: 682Reputation: 682Reputation: 682
You could use `find' to locate the files in the two directories, with -exec md5sum '{}' to produce an md5sum listing of each file.
Then you can use `uniq' to show the duplicates.

find /dir1 /dir2 -type f -exec md5sum '{}' \; | sort | uniq -w32 -D >duplicates

You could use grep to just return files from dir2/ and delete these files.

# Make sure no two duplicates sit in the same directory. The following should print nothing. If it does print something, remove those entries from the "duplicates" file.
grep /dir2 duplicates | uniq -w32 -D

# delete duplicates in /dir2 (the filename starts at column 35 of md5sum output, so cut -c35- keeps names with spaces intact)
grep /dir2 duplicates | cut -c35- | tr '\n' '\0' | xargs -0 rm -v
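The same idea can also be sketched as a small function that never touches the first directory. This is only an illustration under a few assumptions: dedupe_against_base is a made-up name, md5sum and mktemp are available, and the two directories do not overlap. It hashes every file under the base directory, then removes a file under the target directory only if its hash appears in that list.

```shell
# Hypothetical helper: delete files under $2 whose content also exists under $1.
dedupe_against_base() {
    base=$1
    target=$2
    hashes=$(mktemp) || return 1

    # Hash each base file via stdin so the output is just "<md5>  -",
    # then keep the 32-character hash only.
    find "$base" -type f -exec sh -c '
        for f in "$@"; do md5sum < "$f"; done' sh {} + |
        cut -c1-32 | sort -u > "$hashes"

    # Hash each target file and remove it only if the base has the same hash.
    find "$target" -type f -exec sh -c '
        hashes=$1; shift
        for f in "$@"; do
            sum=$(md5sum < "$f" | cut -c1-32)
            if [ -n "$sum" ] && grep -qxF "$sum" "$hashes"; then
                rm -v -- "$f"
            fi
        done' sh "$hashes" {} +

    rm -f -- "$hashes"
}
```

Called as `dedupe_against_base /dir1 /dir2`, it only ever deletes from /dir2. For a preview, replace `rm -v --` with `echo` first. Because find passes the filenames as arguments, names with spaces are not a problem here.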

Last edited by jschiwal; 12-28-2010 at 12:33 PM.
 
  

