LinuxQuestions.org
Old 07-13-2007, 11:44 PM   #1
ordinary
Member
 
Registered: Apr 2007
Location: the Rocket City
Distribution: Debian, Ubuntu, CentOS; in days past Fedora, Solaris, SunOS, 4.2BSD, 4.3BSD, SVR4, AIX, HP-UX
Posts: 101

Rep: Reputation: Disabled
Non-ASCII characters make files/filenames inaccessible?


I have a Fedora Core 6 box with an ext3 filesystem that contains music files. The file names come from tags and ultimately from FreeDB or Gracenote. In any case, several filenames contain acute accents, umlauts, or other non-ASCII characters, and some of these files are inaccessible to bash or Nautilus.

In particular, trying to rename these files results in something like:

Code:
[phil@frederic wmas]$ mv "Antonin Dvorák" Antonin\ Dvorak
mv: cannot stat `Antonin Dvorák': No such file or directory
[phil@frederic wmas]$ mv 'Antonin Dvorák' Antonin\ Dvorak
mv: cannot stat `Antonin Dvorák': No such file or directory
[phil@frederic wmas]$ mv Antonin\ Dvorák Antonin\ Dvorak
mv: cannot stat `Antonin Dvorák': No such file or directory
[phil@frederic wmas]$ ls Antonin\ Dvorák 
ls: Antonin Dvorák: No such file or directory
[phil@frederic wmas]$
I used bash filename completion to enter the filenames, so I know they exist and that I didn't mistype them. Furthermore, find lists them and then reports that they don't exist.

Running find in this filesystem results in:
Code:
[phil@frederic all_formats]$ find . -name stuff -print
find: ./wmas/José-Luis Garcia, Anthony Halstead, Leonard Slatkin; English Chamber Orchestra: No such file or directory
find: ./wmas/Frédéric Chopin: No such file or directory
find: ./wmas/Antonin Dvorák: No such file or directory
find: ./oggs/Georg Friedrich Händel: No such file or directory
find: ./oggs/Schönhertz & Scott: No such file or directory
[phil@frederic all_formats]$
Even ls finds the filenames but doesn't yield meaningful results:

Code:
[phil@frederic wmas]$ ls -liN
 9421871 drwxr-xr-x  3 phil phil 4096 Jan  2  2007 Academy Of St. Martin In The Fields Chamber Ensemble
10962514 drwxr-xr-x  3 phil phil 4096 Jan  2  2007 Adderly, Cannonball
10
<snip>
10962506 drwxr-xr-x  3 phil phil 4096 Jan  2  2007 Alison Krauss & Union Station
       ? ?---------  ? ?    ?       ?            ? Antonin Dvorák
10962880 drwxr-xr-x 13 phil phil 4096 Jan  2  2007 Antonio Vivaldi
<snip>
 9423690 drwxr-xr-x  3 phil phil 4096 Mar 19 20:32 Freddie Jackson
       ? ?---------  ? ?    ?       ?            ? Frédéric Chopin
 9423253 drwxr-xr-x  3 phil phil 4096 Mar 19 20:32 Frederic Munoz, Dante Andreo; Grupo Vocal Gregor
<snip>
10961054 drwxr-xr-x  3 phil phil 4096 Mar 19 20:33 Johnny Paycheck
10961677 drwxr-xr-x  3 phil phil 4096 Mar 19 20:32 Johnny VanZant
       ? ?---------  ? ?    ?       ?            ? José-Luis Garcia, Anthony Halstead, Leonard Slatkin; English Chamber Orchestra
10962934 drwxr-xr-x  3 phil phil 4096 Mar 19 20:32 Josie Kreuzer-Hot Rod Lincoln
<snip>
What's going on? How can I rename (or even remove) these stubborn files?
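
The symptom here is a name that the directory listing returns but that cannot be stat'ed. A minimal sketch for listing such entries follows. It assumes a POSIX shell, and `list_unstatable` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Print directory entries whose names are returned by the directory listing
# but which cannot be stat'ed, the symptom shown as "?" rows by ls.
# list_unstatable is a hypothetical helper name.
list_unstatable() {
    for path in "$1"/* "$1"/.[!.]*; do
        # -e follows symlinks, so also accept dangling symlinks via -L.
        if [ ! -e "$path" ] && [ ! -L "$path" ]; then
            case ${path##*/} in
                '*' | '.[!.]*') ;;   # unmatched glob left as a literal pattern
                *) printf '%s\n' "$path" ;;
            esac
        fi
    done
}
```

Calling `list_unstatable wmas` on a healthy directory should print nothing. Any path it does print is an entry the shell can see but the kernel cannot look up.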

Code:
[phil@frederic ~]$ uname -a
Linux frederic 2.6.20-1.2944.fc6xen #1 SMP Tue Apr 10 18:03:37 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
[phil@frederic ~]$
Thanks,
Phil

Last edited by ordinary; 07-13-2007 at 11:46 PM.
 
Old 07-14-2007, 12:25 AM   #2
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 335
I don't think your problem has anything to do with the special characters. In FC6:

Code:
$ touch test
$ mv test "Antonin Dvorák"
$ ls -l Antonin\ Dvorák
-rw-rw-r-- 1 mmoneta mmoneta 0 Jul 14 00:22 Antonin Dvorák
$ rm Antonin\ Dvorák
From your result, it looks like your filesystem is corrupted. Boot the rescue cd, don't mount the filesystem, then fsck the drive.
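
A minimal sketch of the "don't mount it, then fsck" step, assuming a Linux system where /proc/mounts lists mounted devices. `is_mounted` and `plan_fsck` are hypothetical helper names. The wrapper only prints the fsck command instead of running it, so it is safe to try.

```shell
#!/bin/sh
# Return success (0) if the given device appears as a mount source.
# Assumption: /proc/mounts exists, as on a normal Linux system.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' /proc/mounts
}

# Hypothetical wrapper: refuse to check a mounted filesystem, otherwise
# print the command that would be run. Nothing is executed here.
plan_fsck() {
    if is_mounted "$1"; then
        echo "refusing: $1 is mounted, unmount it first" >&2
        return 1
    fi
    # -f forces a check even if the filesystem is marked clean.
    echo "fsck -f $1"
}
```

For example, `plan_fsck /dev/sdX1` (a placeholder device name) prints the command when the device is not mounted and returns an error when it is.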
 
Old 07-14-2007, 12:50 AM   #3
Quakeboy02
Senior Member
 
Registered: Nov 2006
Distribution: Debian Linux 9 (stretch)
Posts: 3,358

Rep: Reputation: 126
I had something similar to this recently as a result of a mass copy using "cp -axu from to". For some reason, as macemoneta suggests, my drive got corrupted. fsck took care of it, but it was aggravating, and I never did figure out what really caused it.
 
Old 07-16-2007, 02:40 PM   #4
ordinary
Member
 
Registered: Apr 2007
Location: the Rocket City
Distribution: Debian, Ubuntu, CentOS; in days past Fedora, Solaris, SunOS, 4.2BSD, 4.3BSD, SVR4, AIX, HP-UX
Posts: 101

Original Poster
Rep: Reputation: Disabled
I think you are both right. I can touch "Antonin Dvorák", ls, mv, cp and it works as expected. My filesystem seems to be corrupt somehow. Interestingly, it seems to affect only files with non-ASCII characters in the name, though not necessarily all of them.

This filesystem resides on an external USB drive, so I can move it from system to system easily. Right now it's on an Ubuntu box.

As suggested, I tried fsck with the options that looked promising:
Code:
phil@selma:~$ sudo fsck -V /dev/sdb2 -f
fsck 1.40-WIP (14-Nov-2006)
[/sbin/fsck.ext3 (1) -- /dev/sdb2] fsck.ext3 -f /dev/sdb2
e2fsck 1.40-WIP (14-Nov-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Problem in HTREE directory inode 2588675: node (2) has bad min hash
Invalid HTREE directory inode 2588675 (/all_formats/mp3s).  Clear<y>? yes

Problem in HTREE directory inode 2592776: node (4) has bad min hash
Problem in HTREE directory inode 2592776: node (5) has bad min hash
Problem in HTREE directory inode 2592776: node (7) has bad max hash
Invalid HTREE directory inode 2592776 (/all_formats/oggs).  Clear<y>? yes

Problem in HTREE directory inode 6000105: node (2) has bad min hash
Problem in HTREE directory inode 6000105: node (4) has bad min hash
Invalid HTREE directory inode 6000105 (/all_formats/wmas).  Clear<y>? yes

Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdb2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb2: 22683/24133632 files (10.4% non-contiguous), 25879068/48249219 blocks
phil@selma:~$
Then I remounted the disk and had the same problems. I unmounted and ran
Code:
phil@selma:~$ sudo fsck -V /dev/sdb2 -f -p -v
Password:
fsck 1.40-WIP (14-Nov-2006)
[/sbin/fsck.ext3 (1) -- /dev/sdb2] fsck.ext3 -f -p -v /dev/sdb2

   22683 inodes used (0.09%)
    2357 non-contiguous inodes (10.4%)
         # of inodes with ind/dind/tind blocks: 17917/9772/0
25879068 blocks used (53.64%)
       0 bad blocks
       1 large file

   18216 regular files
    4458 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
   22674 files
phil@selma:
So that looked okay, but my problem still existed, so I tried:

Code:
phil@selma:~$ sudo fsck -V /dev/sdb2 -f -c -c -v
Password:
fsck 1.40-WIP (14-Nov-2006)
[/sbin/fsck.ext3 (1) -- /dev/sdb2] fsck.ext3 -f -c -c -v /dev/sdb2
e2fsck 1.40-WIP (14-Nov-2006)
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: ^X        8442496/       48249218
Interrupt caught, cleaning up
phil@selma:
But, because the filesystem is on an external USB disk, that was going to take a week and a half, so I stopped it.

What options are recommended? Other thoughts?

In any case, the problems persist.

I appreciate the interest and further suggestions are welcome. Re-ripping that music is not a pleasant prospect.
 
Old 07-16-2007, 03:09 PM   #5
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 335
Copy off what you can, reformat, copy back. Essentially, treat it as if the hard drive was failing and you had a little notice.
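
One way to "copy off what you can" is to copy each top-level entry separately, so a single unreadable directory does not stop the rest. This is only a sketch: `salvage_copy` is a hypothetical helper name, it assumes a cp that supports `-a` (GNU coreutils does), and it skips hidden top-level entries for brevity.

```shell
#!/bin/sh
# Copy each top-level entry of src into dst on its own, so one unreadable
# entry does not abort the rest. Names that fail are appended to a log file.
# salvage_copy is a hypothetical helper name, not an existing tool.
salvage_copy() {
    src=$1
    dst=$2
    log=$3
    mkdir -p "$dst" || return 1
    : > "$log"
    for entry in "$src"/*; do
        # An empty source leaves the glob as a literal pattern; skip it.
        if [ "$entry" = "$src/*" ] && [ ! -e "$entry" ]; then
            continue
        fi
        # -a preserves ownership, modes, and timestamps where possible.
        if ! cp -a -- "$entry" "$dst"/ 2>/dev/null; then
            printf '%s\n' "$entry" >> "$log"
        fi
    done
}
```

After something like `salvage_copy /mnt/usb/all_formats /srv/rescue failed.log` (paths are placeholders), the log lists the entries that still need to be restored from backup or re-ripped.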
 
Old 07-16-2007, 06:08 PM   #6
ordinary
Member
 
Registered: Apr 2007
Location: the Rocket City
Distribution: Debian, Ubuntu, CentOS; in days past Fedora, Solaris, SunOS, 4.2BSD, 4.3BSD, SVR4, AIX, HP-UX
Posts: 101

Original Poster
Rep: Reputation: Disabled
Yeah, macemoneta, I tend to agree. I was almost hoping someone would say that.

For the record, another directory name is being reported as nonexistent. There is a directory there called "Green Jellÿ", and it wasn't part of the problem. Now it is. It is just too odd that ONLY files with non-ASCII names are affected, but even those are not affected uniformly.

Before I quit I'm going to pull out od(1) and debugfs(8) and root* around some. I've got one little theory that I'd like to check out. These filenames didn't come from my keyboard. They came from audio tags which came from FreeDB and Gracenote which came from who knows where. There may be characters which render in gnome-terminal, but which are unsuitable for filenames.
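
A rough sketch of that od(1) check: dump the raw bytes of each name and test whether they form valid UTF-8. The function names are hypothetical, and it assumes an iconv that exits non-zero on invalid input, as glibc's does.

```shell
#!/bin/sh
# Print the raw bytes of a name in hex, so an accented letter shows up as
# its actual encoding (c3 a1 for a UTF-8 "á", a single e1 for Latin-1).
name_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr '\n' ' ' | tr -s ' ' |
        sed 's/^ //; s/ $//'
}

# Succeed only if the name is a valid UTF-8 byte sequence.
# Assumption: iconv fails on invalid input, as glibc's iconv does.
is_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Walk one directory and report, per entry, the verdict and the name bytes.
# An empty directory yields one line for the unmatched "*" pattern.
report_names() {
    for path in "$1"/*; do
        name=${path##*/}
        if is_utf8 "$name"; then verdict=utf8; else verdict=NOT-utf8; fi
        printf '%s\t%s\t%s\n' "$verdict" "$(name_bytes "$name")" "$name"
    done
}
```

Running `report_names wmas` would show, for instance, whether the "á" in "Antonin Dvorák" is stored as the two bytes c3 a1 or as a lone e1, which would be flagged NOT-utf8.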

If I find anything interesting, I'll post it here.

These files are backed up, but I'm afraid the corrupt filenames may have found their way into the backup.

Later,
Phil


* sorry
 
Old 07-18-2007, 02:16 PM   #7
ordinary
Member
 
Registered: Apr 2007
Location: the Rocket City
Distribution: Debian, Ubuntu, CentOS; in days past Fedora, Solaris, SunOS, 4.2BSD, 4.3BSD, SVR4, AIX, HP-UX
Posts: 101

Original Poster
Rep: Reputation: Disabled
Well, I gave up. From my reading, UTF-8 characters should work fine in filenames. I don't understand my problems at all.

Fortunately, most of the affected stuff was backed up, and I am restoring it. My backup was similarly affected with non-ASCII (really non-Latin-1) characters, but I'm hoping that the restore will be just as mule-headed as everything else was about those file names, and therefore will just report errors and not propagate the bad names back to the new filesystem. If that works, I will reconstitute the backup area and be entirely rid of non-Latin-1 names. (I do a naive backup of my music files with cp -au; I'll restore with cp -a to a newly created filesystem.)
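
To see which names would need attention before or after such a restore, a sketch like the following lists every path whose final component contains a byte outside printable ASCII. Note that this is broader than "non-Latin-1": it also flags Latin-1 accents and control characters. `list_non_ascii` is a hypothetical helper name.

```shell
#!/bin/sh
# List every path under a directory whose final component contains a byte
# outside printable ASCII (space through tilde). LC_ALL=C makes find match
# names byte by byte instead of decoding them in the current locale.
# list_non_ascii is a hypothetical helper name.
list_non_ascii() {
    LC_ALL=C find "$1" -name '*[! -~]*' -print
}
```

For example, `list_non_ascii /path/to/backup` (a placeholder path) prints one path per matching file or directory, which can then be reviewed or renamed by hand.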

I did manage to educate myself on Unicode characters and UTF-8 encodings. I'm a little late to that party, I guess, but it has never been an issue for me. My specialties are modeling and simulation and embedded systems. I use Linux as a development and target platform for my simulation work, and as a development platform for my embedded work. Oh yeah, and as my all-purpose OS at home.

I also learned a good bit about ext2 and ext3 filesystems and debugfs (the tool, not the filesystem). Debugfs is obstinate about UTF-8 characters, but it is a fun and seemingly very dangerous program. I love Unix/Linux. You want to hang yourself, Unix will happily hand you the rope. Truly a friend indeed. I'm amazed that I did no damage to my filesystem with debugfs.
 
  

