LinuxQuestions.org


aronnok 10-05-2011 12:36 PM

File name containing Danish / Norwegian characters being renamed automatically.
 
Hi,
I have a Linux FTP server (Red Hat Enterprise Linux Server release 5.6 (Tikanga)).
Proftpd version : proftpd-1.3.3c-1.el5
Installed with mysql support.
This is also configured with mod_lang

We have some Scandinavian clients who sometimes upload files whose names contain special characters such as "Ø" and "Æ". When they do, the files are automatically renamed on the server. We have identified that, for example, "Ø" is replaced with "_".

My main config file is /etc/proftpd.conf.
This file basically includes other individual config files. I have 3 FTP servers configured there.

To support languages, I have set the following parameter: "UseEncoding on"

# grep Encoding /etc/proftpd* -R
/etc/proftpd/ifas_swe.conf:UseEncoding on
/etc/proftpd/ifas_no.conf:UseEncoding on
/etc/proftpd/ifas_swe.conf.old:UseEncoding off
/etc/proftpd/ifas_dk.conf:UseEncoding on
/etc/proftpd.conf:UseEncoding off
(last one is off as a tryout)

Funny thing is, I can see the special characters in the FTP client software. But when I look at the directory from a terminal on the server, the names are different.

As a test, I uploaded two files named "ÆpiÅcØ 3.jpg" and "ŒŒpÖiØc ØØ4.jpg". From a terminal on the FTP server, I see them as "_pi_c_ 3.jpg" and "__p_i_c __4.jpg".

Just to provide more information:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=


How do I make sure that files with special characters in their names are not renamed, so that my clients are not affected?

Do I need to change the Environment Directives like the Language settings?

Any help in this regard would be highly appreciated.

Guttorm 10-06-2011 09:29 AM

Hi

In Scandinavia, most people's computers use UTF-8. But older computer systems use ISO-8859-1. All the letters are in that encoding as well, and we used it before Unicode became common. There is nothing special about Scandinavia in this regard. For example German, French and Spanish all have a few extra letters. If you search on the problem, you'll get more answers if you use any of these languages.

I don't know much about ProFTPD, but I searched for it and found this page:

http://www.proftpd.org/docs/modules/mod_lang.html

As I understand it, if you use "UseEncoding on", you deny the FTP client the possibility to change the encoding. If the client then sends file names as ISO-8859-1, the result is invalid UTF-8, which might be the problem.
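
To illustrate why ISO-8859-1 file names are a problem for a server that expects UTF-8, here is a minimal Python sketch. It is not ProFTPD's actual code: the helper names `is_valid_utf8` and `underscore_invalid` are made up for this example, and the "_" substitution only mimics the result reported in the first post.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def underscore_invalid(raw: bytes) -> str:
    """Hypothetical stand-in for the server-side renaming:
    decode as UTF-8 and turn every undecodable byte into "_"."""
    return raw.decode("utf-8", errors="replace").replace("\ufffd", "_")


name = "ÆpiÅcØ 3.jpg"

# What a client using ISO-8859-1 would put on the wire:
# one byte per letter, e.g. "Ø" becomes the single byte 0xD8.
as_latin1 = name.encode("iso-8859-1")

# What a client using UTF-8 would send: two bytes per special letter.
as_utf8 = name.encode("utf-8")

print(is_valid_utf8(as_latin1))       # False
print(is_valid_utf8(as_utf8))         # True
print(underscore_invalid(as_latin1))  # _pi_c_ 3.jpg
print(underscore_invalid(as_utf8))    # ÆpiÅcØ 3.jpg
```

The ISO-8859-1 bytes for "Æ", "Å" and "Ø" are not valid UTF-8 on their own, so a server that insists on UTF-8 has nothing sensible to map them to. That would be consistent with the underscores seen in the terminal.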

I would try not using "UseEncoding" at all. The docs say: "By default, the mod_lang will automatically discover the local character set, and will use UTF8 for the client character set. The module will also allow the use of UTF8 encoding to be changed by clients using the OPTS UTF8 command (as per RFC2640)".

