LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 10-08-2018, 04:31 PM   #1
fillemon
LQ Newbie
 
Registered: Oct 2018
Posts: 1

Rep: Reputation: Disabled
find all the same named directories on a hard disk


Hello,

My hard disk has a great number of identically named directories scattered all over the place, and most of the time they hold the same data. I don't want to use a compare tool to go through each file... at least not at this time. I would just like to know where the same-named directories are, and then merge them together...

I haven't found a tool that can do this out of the box, so I guess it has to be done by scripting, and I'm not a scripting wizard. I think it might be interesting to have a script that compares one directory against the rest of the hard disk.

Can anybody point me to a solution? For example, which command should I use? I thought of the find command.

I think this might help me... but then I would need to check what percentage of the contents is the same, and if a directory holds a large percentage of the same files, I would like to merge it.
find /disk -type d -name "Documents"



Thank you very much,
kind regards

Last edited by fillemon; 10-08-2018 at 04:37 PM.
 
Old 10-08-2018, 04:36 PM   #2
scasey
Senior Member
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.6
Posts: 3,646

Rep: Reputation: 1209
Yup. find will do that for you with minimal (if any) scripting. See man find
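For instance, a minimal sketch (assuming GNU find, and /disk as the tree to scan) that lists every directory name occurring more than once:

```shell
# Print only each directory's basename (GNU find's -printf '%f\n'),
# then keep the names that appear more than once anywhere under /disk.
find /disk -type d -printf '%f\n' | sort | uniq -d
```

Each name this prints exists as a directory in at least two places; a follow-up find -name on that name locates the actual paths.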
 
Old 10-08-2018, 11:25 PM   #3
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,112
Blog Entries: 3

Rep: Reputation: 2013
You might look at fslint or fdupes.

Or if you work with find you can use -exec to make a checksum for each file and pipe that into sort and then into uniq. With the right option(s) uniq will show duplicates.
 
Old 10-09-2018, 05:07 AM   #4
fatmac
Senior Member
 
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Posts: 3,097

Rep: Reputation: Disabled
Maybe run tree & pipe it to a file to peruse later using any text program, (cat/less/more/vi/nano/etc).
(Then you could just copy them all into one directory.)

Code:
tree /home > textfile
cat textfile | grep pdf > file2

Last edited by fatmac; 10-09-2018 at 05:12 AM.
 
Old 10-09-2018, 07:40 AM   #5
l0f4r0
Member
 
Registered: Jul 2018
Location: Paris
Distribution: Debian
Posts: 852

Rep: Reputation: 286
Quote:
Originally Posted by fatmac View Post
Maybe run tree & pipe it to a file to peruse later using any text program, (cat/less/more/vi/nano/etc).
(Then you could just copy them all into one directory.)
Code:
tree /home > textfile
cat textfile | grep pdf > file2
If I'm correct, your method is manual (grepping for strings) and therefore tedious to implement.
And cat is unnecessary (--> grep 'pdf' textfile)

@fillemon:
IMO, the simplest approach is Turbocapitalist's.
I would suggest something like:
Code:
find /disk -type d -iname '*documents*' -exec basename {} \; | sort | uniq -d
It will give you the names of the duplicate folders that contain at least "documents" in their name (case-insensitive). If need be, you can then run find on each folder name found to get its path:
Code:
find /disk -type d -name 'exact_folder_name'
However, something in your original post puzzles me, because you indicated:
Code:
find /disk -type d -name "Documents"
As it stands, it implies that you already know what to search for (directories exactly named "Documents"), so this command already gives you what you want...

Last edited by l0f4r0; 10-09-2018 at 07:45 AM.
 
Old 10-09-2018, 07:47 AM   #6
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,112
Blog Entries: 3

Rep: Reputation: 2013
Quote:
Originally Posted by l0f4r0 View Post
I would suggest something like:
I was thinking something more along these lines to compare the content and not the file names:

Code:
find /dir01/ /dir02/ -type f -exec md5sum {} \; \
| sort | uniq --check-chars=32 -D
Other hash algorithms could be used instead, but there's a rather low chance of an accidental MD5 collision.
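As a sketch of that variant (assuming GNU coreutils), swapping in SHA-256 only changes the prefix length that uniq compares, since sha256sum prints 64 hex characters where md5sum prints 32:

```shell
# Same pipeline with SHA-256: hash every file, sort so identical
# hashes are adjacent, and print all lines whose first 64 chars
# (the hash) repeat -- i.e. files with identical content.
find /dir01/ /dir02/ -type f -exec sha256sum {} \; \
| sort | uniq --check-chars=64 -D
```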

Last edited by Turbocapitalist; 10-09-2018 at 07:52 AM. Reason: d -> D
 
Old 10-09-2018, 07:54 AM   #7
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,086

Rep: Reputation: 2905
Quote:
Originally Posted by fillemon View Post
i don't wanna use a compare tool to go through each file... at least not at this time.
Given the requirements for just names, I'd use locate - much less hammering of the hardware till the OP actually decides what they want to do.

Of course that may return only a subset depending on prune options, but it should be pretty complete for most use cases.
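The prune point can also be illustrated with find itself (a sketch, using a hypothetical /disk/backup subtree as the part to skip):

```shell
# Skip everything under /disk/backup entirely (-prune), and print
# any other directory named Documents elsewhere under /disk.
find /disk -path /disk/backup -prune -o -type d -name Documents -print
```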
 
Old 10-09-2018, 07:58 AM   #8
l0f4r0
Member
 
Registered: Jul 2018
Location: Paris
Distribution: Debian
Posts: 852

Rep: Reputation: 286
Quote:
Originally Posted by Turbocapitalist View Post
I was thinking something more along these lines to compare the content and not the file names:
Code:
find /dir01/ /dir02/ -type f -exec md5sum {} \; \
| sort | uniq --check-chars=32 -D
Okay, but you are searching for files, not directories.

@fillemon: can you be more specific please? Do you want to search folders that have the same names, or duplicate files?
 
Old 10-09-2018, 08:31 AM   #9
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: FreeBSD/Slackware-14.2+/ArcoLinux
Posts: 9,082

Rep: Reputation: 1903
You might have a two-parter: first find duplicate directories, then use that result to search for duplicate files inside the found directories. Going back to @Turbocapitalist's suggestion of using fslint or fdupes, perhaps along with find.

Or something "crazy" like this. Not completely tested code; the theory is there, so verify it before trusting it. (I don't feel like setting up duplicate directories and duplicate files to test this until I get it to work properly.)
Code:
#!/bin/bash

#set -x

count1=0
count2=0

working_dir1=/run/media/userx/3TB-External
working_dir2=/media/ntfs1

while read -r d1 ; do

 echo "outer loop $count1"
 echo "$d1"
	while read -r d2 ; do
		echo "inner loop $count2"
		echo "$d2"
		# compare directory basenames, not full paths; an "exit"
		# inside a subshell would not leave the loop, so break instead
		if [[ "${d1##*/}" == "${d2##*/}" ]] ; then
			echo "match: $d1 <-> $d2"
			break
		fi
		((count2++))
	done <<<"$(find "$working_dir2" -mindepth 1 -type d)"

	count2=0
	((count1++))
done <<<"$(find "$working_dir1" -mindepth 1 -type d)"
You could even extend it to move all of the files into one central location, then delete the now-empty directories they came from to clean up.
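A hedged sketch of that merge step, assuming one copy of the directory has already been chosen as the keeper (paths reuse the script's working_dir locations with a hypothetical Documents folder; mv -n refuses to overwrite files that already exist in the destination):

```shell
# Move files from the duplicate into the keeper without clobbering
# anything already there, then remove the duplicate directory --
# rmdir only succeeds if it is actually empty afterwards.
mv -n /media/ntfs1/Documents/* /run/media/userx/3TB-External/Documents/
rmdir /media/ntfs1/Documents
```

Files that mv -n skipped (same name, possibly different content) stay behind, so rmdir failing is a useful signal that the two copies were not identical.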

Last edited by BW-userx; 10-09-2018 at 10:47 AM.
 
Old 10-09-2018, 09:14 AM   #10
fatmac
Senior Member
 
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Posts: 3,097

Rep: Reputation: Disabled
@ l0f4r0

Just trying to keep things simple as it is the OPs first post.
 
1 member found this post helpful.
  

