LinuxQuestions.org


ovidnet 08-31-2007 09:53 AM

multilanguage filename characters issue
 
I have a Slackware Linux machine that I use as a desktop.
I often use this machine to mount Windows partitions (FAT32, NTFS) and copy files from them. Typically I am copying files off damaged Windows installations, and the source file names can be in different languages, usually French.

For example, the original file is
Mes fichiers reçus
and after copying it became
Mes fichiers reçus

I want to keep the English installation of Slackware (Bluewhite64).
The question is: how can I control the charset used for filenames? Can I add multi-language support, or, if I can't, how can I switch the system settings? I don't want to use a conversion tool; I want to use system settings.

(To copy, I used cp or F5 in MC.)

http://www.linuxquestions.org/questi...d.php?t=382914 is a related post, but I'm not satisfied with the response.

Samotnik 08-31-2007 03:25 PM

The only way to use filenames from different codepages is to name them in the UTF-8 codeset. You should install a utf8 locale - something like en_US.UTF8 - to do it.
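To illustrate why names from different codepages get garbled, here is a minimal sketch that assumes the copied name ended up as UTF-8 bytes that something then read as ISO-8859-1. The thread does not say which direction the mix-up went, so the choice of encodings is an assumption; the variable names are made up for the example.

```shell
# "reçus" written as UTF-8 bytes: ç is the two bytes 0xC3 0xA7 (octal 303 247)
name_utf8=$(printf 're\303\247us')

# Read those bytes as if they were ISO-8859-1 and re-encode as UTF-8:
# each of the two bytes of ç now becomes a character of its own
garbled=$(printf '%s' "$name_utf8" | iconv -f ISO-8859-1 -t UTF-8)

printf 'original: %s\n' "$name_utf8"
printf 'misread:  %s\n' "$garbled"

# Byte counts differ too: 6 bytes for the original, 8 for the misread name
printf '%s' "$name_utf8" | wc -c
printf '%s' "$garbled" | wc -c
```

With a UTF-8 locale on both ends, the bytes and their interpretation agree, so this kind of double encoding does not happen.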

ovidnet 08-31-2007 09:29 PM

Quote:

Originally Posted by Samotnik (Post 2877147)
You should install a utf8 locale - something like en_US.UTF8 - to do it.

OK, how do I do this?
Where are the packages with the locales located?
How can I switch locales?

ovidnet 10-10-2007 02:38 PM

I exported the locale as fr_CA.UTF8, but it works only in X (KDE), not on the command line (mc).

Su-Shee 10-10-2007 03:10 PM

Slackware already includes all the necessary Unicode locales.
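One way to check which UTF-8 locales are actually installed is to filter the output of `locale -a`. This is only a sketch: the helper names `list_utf8_locales` and `has_utf8_locale` are invented for this example, and the spelling of the codeset (`utf8` or `UTF-8`) varies between systems.

```shell
# Print every installed locale whose name mentions a UTF-8 codeset.
# "|| true" keeps the function from failing when nothing matches.
list_utf8_locales() {
    locale -a 2>/dev/null | grep -i -E 'utf-?8' || true
}

# Succeed if a UTF-8 variant of the given language_TERRITORY is installed,
# e.g. "has_utf8_locale fr_CA".
has_utf8_locale() {
    list_utf8_locales | grep -i -q "^$1\."
}

list_utf8_locales

if has_utf8_locale fr_CA; then
    echo "fr_CA UTF-8 locale is available"
else
    echo "fr_CA UTF-8 locale is not installed"
fi
```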

I mix certain German locale settings with English settings this way - just replace de_DE with fr_CA:

export LC_CTYPE="de_DE.utf-8"
export LC_COLLATE="de_DE.utf-8"
export LANG=en_US.utf-8
export LC_PAPER="de_DE.utf-8"

(mutt, less and vim need their own settings, but support Unicode perfectly well.)

The result is that I still get English messages, error messages and the like, but a correct German alphabetical sort order and a German character set (umlauts rather than accents), all encoded in UTF-8.
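Adapted to the French Canadian locale mentioned earlier in the thread, the same mix might look like the sketch below. It assumes that an fr_CA UTF-8 locale is installed; the final `locale` call only prints the settings the shell ends up with, so they can be checked both in an xterm and on the console.

```shell
# English messages, French (Canada) character handling, sorting and paper size
export LANG=en_US.utf-8
export LC_CTYPE="fr_CA.utf-8"
export LC_COLLATE="fr_CA.utf-8"
export LC_PAPER="fr_CA.utf-8"

# Show the settings now in effect for this shell and its child processes
locale 2>/dev/null | grep -E '^(LANG|LC_CTYPE|LC_COLLATE|LC_PAPER|LC_MESSAGES)=' || true
```

Exports like these only affect the shell they are typed into and the programs started from it, so they need to go into a startup file to apply everywhere.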

Of course, I have to make sure my X font supports all this.

On the console without X you'll need a different setup, which is mostly a font issue.

Anyhow, I usually don't use Unicode on the console, but mc inside an xterm that supports Unicode works properly: it moves and deletes files and so on. I've just tried it with a random Chinese filename.

After a short Google, I'd say "man setfont" might help you with the console.

In your case, you might have another problem: Windows encodes Unicode not in UTF-8 but in UTF-16, and its files may contain a so-called BOM (byte order mark) to indicate the byte order, which I usually ignore because Linux uses UTF-8. In that case you'll have to convert the file with iconv or another tool.
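A conversion with iconv could look roughly like this. The sketch assumes the Windows file is UTF-16 with a BOM and builds a tiny sample file itself so that it can be run anywhere; note that it converts file contents, not file names.

```shell
# Build a small sample: BOM (FF FE) followed by "ç" and a newline in UTF-16LE
sample=$(mktemp)
printf '\377\376\347\000\012\000' > "$sample"

# "UTF-16" lets iconv take the byte order from the BOM
converted=$(iconv -f UTF-16 -t UTF-8 "$sample")

printf 'converted text: %s\n' "$converted"
rm -f "$sample"
```

For a real file, the same call with the output redirected to a new file would be used, for example `iconv -f UTF-16 -t UTF-8 input.txt > output.txt`, where both file names are placeholders.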


All times are GMT -5.