LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   How to list duplicate filenames (https://www.linuxquestions.org/questions/linux-newbie-8/how-to-list-duplicate-filenames-917457/)

wonfineday 12-06-2011 05:10 PM

How to list duplicate filenames
 
How can I see a list of duplicate file names (even if content is different) under a given directory?

sycamorex 12-06-2011 05:26 PM

Hi and welcome to LQ.

I don't think you can have duplicate file names (i.e. identical file names) in the same directory. You can have files with the same base name and different extensions (e.g. file.jpg, file.png, etc.), but I wouldn't call those duplicates.

wonfineday 12-06-2011 05:33 PM

List duplicate filenames under a subdirectory
 
Perhaps I should clarify:
- I want to see a list of all files named foo.txt under ~/
- thus files named ~/dir1/foo.txt, ~/dir1/subdir1/foo.txt, ~/dir2/foo.txt should all be listed (so I can then manually verify that they are indeed different)

aazkan 12-06-2011 07:05 PM

Hi,

Would this help?
In the directory containing the foo.txt files, run:

Code:

find . -type f -name "foo.txt" -print

wonfineday 12-06-2011 08:28 PM

How to find duplicate filenames under a subdirectory
 
Thanks for the suggestion, aazkan; that would help if I were looking for a specific file, but I need to identify ALL duplicate filenames.
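[To list every duplicate basename in one pass, rather than searching for one known name, a minimal sketch is the following; it assumes GNU find for the -printf action:]

```shell
# Print only the basename of each regular file, then let sort + uniq -d
# keep the names that occur more than once.
find . -type f -printf '%f\n' | sort | uniq -d
```

[On systems without GNU find (e.g. BSD/macOS), piping a plain `find . -type f` through `awk -F/ '{print $NF}'` yields the same basenames.]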

coolsg5636 12-06-2011 11:56 PM

I guess you will need to write a script for it, and that script will not be resource efficient, as it has to compare the file name of each file with those of all other files in the mentioned directory.

Do you have any knowledge of scripting?

Is this an exam/classroom question?

EricTRA 12-07-2011 12:36 AM

Hello,

LQ has a great search option, which turned up this one right here at LinuxQuestions. It should be a good base for what you need.

Here are some more possible solutions, found using Google in under a second.

Kind regards,

Eric

devUnix 12-07-2011 12:51 AM

Code:

[demo@localhost temp]$ find `pwd` . | grep -v '^\.' | awk -F '/' '{print $NF,$0}' | sort -k1,1
bye /home/demo/temp/bye
bye /home/demo/temp/there/bye
duplicate.sh /home/demo/temp/duplicate.sh
hello_again /home/demo/temp/there/hello_again
hello /home/demo/temp/hello
hello /home/demo/temp/there/hello
temp /home/demo/temp
there /home/demo/temp/there

Or

Code:

[demo@localhost temp]$ find `pwd` . | grep -v '^\.' | awk -F '/' '{print $NF,$0}' | sort -k1,1 | cut -d ' ' -f2
/home/demo/temp/bye
/home/demo/temp/there/bye
/home/demo/temp/duplicate.sh
/home/demo/temp/there/hello_again
/home/demo/temp/hello
/home/demo/temp/there/hello
/home/demo/temp
/home/demo/temp/there
[demo@localhost temp]$


Or:

For the Ease of Visualisation!

Code:

[demo@localhost temp]$ find `pwd` . | grep -v '^\.' | awk -F '/' '{print $NF " ->",$0}' | sort -k1,1
bye -> /home/demo/temp/bye
bye -> /home/demo/temp/there/bye
duplicate.sh -> /home/demo/temp/duplicate.sh
hello -> /home/demo/temp/hello
hello -> /home/demo/temp/there/hello
hello_again -> /home/demo/temp/there/hello_again
temp -> /home/demo/temp
there -> /home/demo/temp/there

Cheers!


Or put this in a file:


Code:

#!/bin/bash
# Find Duplicate File Names - By Dev (dk_mahadeva@yahoo.com)
path=$1
: ${path:=.}
# Resolve $path to an absolute path once; searching both `pwd` and $path
# in the same find call listed some files twice.
path=$(cd "$path" && pwd) || exit 1
# sort -k1,1 is the portable spelling of the obsolete sort +0 -1
fullList=$(find "$path" | awk -F '/' '{print $NF,$0}' | sort -k1,1)
fileNames=$(find "$path" | awk -F '/' '{print $NF}' | sort | uniq -d)
for EACH in $fileNames; do
        echo "$fullList" | grep -wE "^$EACH "
done
exit 0

Example:

Code:

[demo@localhost temp]$ ./findDup.sh /home 2> /dev/null | head -n 3
bye /home/demo/bye
bye /home/demo/temp/bye
bye /home/demo/temp/there/bye

If no command-line argument (the path to begin from) is specified, the script defaults to the current directory (.).

Feeling glad now?
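[Note that the script above word-splits $fileNames, so filenames containing spaces break it. A whitespace-tolerant variant, sketched here assuming GNU find's -printf, carries the basename and full path on one tab-separated line:]

```shell
#!/bin/bash
# List full paths of files whose basename occurs more than once under $1
# (default: current directory). A tab separator keeps spaces in names intact.
dir=${1:-.}
find "$dir" -type f -printf '%f\t%p\n' | sort -t "$(printf '\t')" -k1,1 |
awk -F '\t' '
    # Count each basename and collect its full paths.
    { count[$1]++; paths[$1] = paths[$1] $2 "\n" }
    # At the end, print paths only for basenames seen more than once.
    END { for (name in count) if (count[name] > 1) printf "%s", paths[name] }
'
```

[Within each duplicated name the paths come out in sorted order; the names themselves print in awk's arbitrary array order.]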


All times are GMT -5.