Linux - Server: This forum is for the discussion of Linux software used in a server-related context.
Can anyone help me with a script that can be used to find duplicate files in a directory? I have a directory called dir1 which has some sub-directories, and there are some .i and .o files in it. There are duplicate files across the different directories. I want to identify them and copy the files to another directory, leaving the duplicates in dir1.
How can you identify the duplicates? Is the name enough or do you also need to check the size and/or checksum?
I do not understand "There are some duplicate files in the different directories. I want to identify them and copy the files to other directory leaving the duplicates in the same dir1." When you identify a file in a sub-directory of dir1 which is a duplicate of a file in dir1, which "other" directory do you want to copy it to, and is it OK to change from having two duplicates to having three duplicates?
Hi Catkin,
Thanks for your reply.
Identifying duplicate files by name will help me in the first place. I have a directory with many sub-directories, under which I have *.i and *.o files. Now I want to identify the duplicate files and do the following:
1. If files have the same name, display the locations of the files with the same name and store that information in a file.
2. Then check the file contents; if the contents are the same, delete one copy and retain the other, so I don't have a duplicate copy with the same content.
Hope you understand what I am looking for. Could you please help me here?
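For step 2, hashing every file once and grouping by checksum scales better than comparing files pairwise. A minimal sketch of that idea, assuming GNU coreutils `md5sum` and paths without spaces (not tested on your tree):

```shell
#!/bin/bash
# Hash every .i/.o file under dir1; after sorting, identical checksums
# become adjacent lines, so any line whose hash matches the previous
# one is a duplicate by content.
find dir1 -type f \( -name '*.i' -o -name '*.o' \) -exec md5sum {} + |
sort |
awk '$1 == prev { print $2 } { prev = $1 }'
# Prints the second and later copies of each duplicate set, one path
# per line. Review the list before feeding it to rm.
```

This only reports duplicates by content; combine it with a name check if you also need the same-name report from step 1.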
Idea for a brute-force way (OK if not done often on a large number of files), not tested:
Code:
find dir1 -type f \( -name '*.i' -o -name '*.o' \) -print0 |
while IFS= read -r -d '' filename1
do
    [[ -e $filename1 ]] || continue     # may already have been deleted below
    basename=${filename1##*/}           # remove the path from $filename1, leaving only the basename
    count=0
    # Feed the inner loop with process substitution, not a pipe, so that
    # $count set inside the loop is still visible after 'done'.
    while IFS= read -r -d '' filename2
    do
        [[ $filename2 == "$filename1" ]] && continue   # skip the file itself
        echo "'$filename1' duplicate found at '$filename2'" >> output.txt
        (( count++ ))
        # If the contents also match, delete the extra copy and keep $filename1
        cmp -s "$filename1" "$filename2" && rm -- "$filename2"
    done < <(find dir1 -type f -name "$basename" -print0)
done
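One bash pitfall worth spelling out for loops like the above: a `while` loop fed by a pipe runs in a subshell, so variables set inside it are lost when the loop ends. Process substitution keeps the loop in the current shell. A small demonstration:

```shell
#!/bin/bash
# Piped loop: count is incremented in a subshell and discarded.
count=0
printf 'a\nb\n' | while read -r line; do (( count++ )); done
echo "piped:  $count"    # prints 0

# Process substitution: the loop runs in the current shell.
count=0
while read -r line; do (( count++ )); done < <(printf 'a\nb\n')
echo "subst:  $count"    # prints 2
```

This is why the inner `find` above is read via `< <(find ...)` rather than a pipe.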