LinuxQuestions.org
Old 04-03-2011, 07:18 PM   #1
Wocky
LQ Newbie
 
Registered: Oct 2004
Location: Australia
Posts: 29

Rep: Reputation: 1
Duplicated filenames in the same directory


The (WD 320GB) drive has a single ext3 filesystem on it. It has had some problems in the past, but all were fixed with fsck -y. Now there are several directories with duplicate filenames. The files with duplicated names are hard links of each other, but the names are identical. I've run several diagnostics over them, looking for, e.g., non-printing characters in the names, but they are completely identical. Here are some examples:

Code:
$ ls -l | awk '$2 == 2{print}'
-rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
-rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
-rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
-rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
-rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
-rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
-rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
-rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
...
$ ls -l | awk '$2 == 2{print}' | wc -l
338
$ ls -l | awk '$2 == 2{print}' | sort | uniq | wc -l
169
$
These are (obviously) from a directory of mp3s, but similar duplications occur throughout the fs - there are several thousand files affected. Some of the diagnostics were programmes I wrote that accessed the directory itself (through the dirent structure).
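
The counts above (338 lines, 169 after sort | uniq) can be cross-checked by printing only the listing lines that occur more than once. What follows is a minimal sketch, not a tool anyone in the thread used: the function name dup_lines is made up for illustration, and it assumes no filename contains a newline.

```shell
# dup_lines: hypothetical helper, not part of any package.
# Reads lines on stdin and prints each line that occurs more than once.
# Assumes one directory entry per line (no newlines inside filenames).
# LC_ALL=C forces a plain byte-wise comparison in both sort and uniq.
dup_lines() {
    LC_ALL=C sort | LC_ALL=C uniq -d
}
```

It could be used as "ls -l | awk '$2 == 2{print}' | dup_lines | wc -l" to count the distinct lines that appear at least twice.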

I always thought duplicate filenames in the same directory were impossible in unix/linux; this appears to prove me wrong. Am I missing something?

(Kernel version 2.4.20 with xfs extensions. The installation was originally Red Hat 7, but I've changed almost everything, so it's probably more accurate to call it a custom distro.)

Last edited by Wocky; 04-03-2011 at 07:20 PM.
 
Old 04-03-2011, 08:38 PM   #2
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,758

Rep: Reputation: 643
I know you said that you've checked, but could you please post the output of 'ls -liq <dir>', where <dir> contains one of these duplicates? Your output doesn't show the full paths; can you confirm the paths are identical as well?
 
Old 04-03-2011, 08:49 PM   #3
carltm
Member
 
Registered: Jan 2007
Location: Canton, MI
Distribution: CentOS, SuSE, Red Hat, Debian, etc.
Posts: 697

Rep: Reputation: 93
It's possible for a filename to have a non-printing character,
such as ^H. If you had files named "file1.txt" and "file12^H.txt"
and typed "ls" it would appear that you have duplicates.

Use "ls | cat -vt" to see any non-printing characters.
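
To illustrate the point about ^H, here is a small sketch that builds the two example names in a scratch directory and lists them through cat -vt. The function name show_hidden_demo is hypothetical, and it assumes mktemp is available and that ls writes raw names when its output is a pipe.

```shell
# show_hidden_demo: hypothetical helper for illustration only.
# Builds a scratch directory with two names that can look alike on a
# terminal, then lists it through cat -vt so control characters show up.
show_hidden_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/file1.txt"
    # printf turns \b into a real backspace byte (cat -v shows it as ^H)
    touch "$dir/$(printf 'file12\b.txt')"
    ls "$dir" | cat -vt
    rm -rf "$dir"
}
```

Running show_hidden_demo should list one plain name and one containing a visible ^H, which is how a hidden backspace would be caught.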
 
Old 04-03-2011, 08:57 PM   #4
Wocky
LQ Newbie
 
Registered: Oct 2004
Location: Australia
Posts: 29

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by kbp
I know you said that you've checked, but could you please post the output of 'ls -liq <dir>', where <dir> contains one of these duplicates? Your output doesn't show the full paths; can you confirm the paths are identical as well?
Certainly.

Code:
$ ls -liq . | awk '$3 == 2{print}'
     356 drwx------    2 Wocky    users       86016 Mar 11 13:01 .
30736822 -rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
30736822 -rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
30736792 -rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
30736792 -rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
28754138 -rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
28754138 -rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
28852283 -rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
28852283 -rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
...
(These are only the first few; as indicated in the OP, there are more than 300 files. I can post the whole lot, but it's really just more of the same.)

Thanks
Wocky

Last edited by Wocky; 04-03-2011 at 09:29 PM.
 
Old 04-03-2011, 09:03 PM   #5
Wocky
LQ Newbie
 
Registered: Oct 2004
Location: Australia
Posts: 29

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by carltm
It's possible for a filename to have a non-printing character,
such as ^H. If you had files named "file1.txt" and "file12^H.txt"
and typed "ls" it would appear that you have duplicates.

Use "ls | cat -vt" to see any non-printing characters.
Thanks carltm. "ls -l | cat -vt" gives the same output as before. I've also tried "ls -liq", which prints the inode (-i) and quotes non-printing characters (-q).

Wocky
 
Old 04-03-2011, 09:08 PM   #6
Wocky
LQ Newbie
 
Registered: Oct 2004
Location: Australia
Posts: 29

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by kbp
I know you said that you've checked, but could you please post the output of 'ls -liq <dir>', where <dir> contains one of these duplicates? Your output doesn't show the full paths; can you confirm the paths are identical as well?

Sorry, kbp, I've just re-read that. Here it is again:

Code:
$ ls -liq /home/Wocky/temp_audio/mp3 | awk '$3 == 2{print}'
30736822 -rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
30736822 -rw-------    2 Wocky    users    27132655 Dec 13 22:48 AfterYouveGone.2010.12.12.mp3
30736792 -rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
30736792 -rw-------    2 Wocky    users    27118862 Dec 10 11:51 ItsAGoodDay.2010.12.12.mp3
28754138 -rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
28754138 -rw-------    2 Wocky    users    27091695 Dec 17 22:51 MaliceAforethought.2010.12.19.mp3
28852283 -rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
28852283 -rw-------    2 Wocky    users    26109909 Dec  1 12:04 RoundOnAWellKnownTheme.2010.12.05.mp3
...
Wocky

Last edited by Wocky; 04-03-2011 at 09:29 PM.
 
Old 04-03-2011, 09:21 PM   #7
carltm
Member
 
Registered: Jan 2007
Location: Canton, MI
Distribution: CentOS, SuSE, Red Hat, Debian, etc.
Posts: 697

Rep: Reputation: 93
This is odd. I would umount the filesystem and run fsck -y again.
 
Old 04-03-2011, 09:33 PM   #8
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 655
The 2 is the link count to the same file.

If the inode numbers were different, they would be different files with identical names and you could use "find -inum <#> -exec mv '{}' dir/ \;" to process them.

If fsck won't correct the directory (the entries are ordinary files as far as the kernel is concerned), one thing you might try is something like:
mkdir dupes
find ./ -maxdepth 1 -links 2 -type f -exec cp '{}' dupes/ \;
mv dupes/* .
rmdir dupes

This will copy files with a link count of 2 in the current directory to the dupes/ subdirectory. Since they have the same name, you should end up with one file name in the subdirectory.

Maybe back up a couple of these to another directory for testing, to make sure it works. I can't test it, since I can't create 2 links to the same file with the same name.
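
The link count and inode ideas above can be tried safely in a scratch directory. This is only a sketch under stated assumptions: hardlink_demo is a hypothetical name, the two links deliberately have different names (identical names cannot be created with normal tools), and a find that supports -maxdepth is assumed, as in the commands above.

```shell
# hardlink_demo: hypothetical helper for illustration only.
# Creates one file with two differently named hard links in a scratch
# directory, then shows how link count and inode number identify them.
hardlink_demo() {
    dir=$(mktemp -d) || return 1
    echo data > "$dir/a.mp3"
    ln "$dir/a.mp3" "$dir/b.mp3"    # second name, same inode
    inode=$(ls -i "$dir/a.mp3" | awk '{print $1}')
    # Both names are regular files with a link count of 2 ...
    find "$dir" -maxdepth 1 -type f -links 2 | sort
    # ... and both are matched when searching by inode number.
    find "$dir" -maxdepth 1 -inum "$inode" | wc -l
    rm -rf "$dir"
}
```

Calling hardlink_demo prints the two paths found by -links 2, followed by the number of names that share the inode.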

Last edited by jschiwal; 04-03-2011 at 09:35 PM.
 
  


All times are GMT -5. The time now is 06:35 AM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration