Getting extended ASCII (chars > 127) to work in Linux
I have an application where the client is written in Visual Studio (C#), run on PCs, and the server end has traditionally been SCO. We're now migrating to Linux. I can, for example, input "Test Ñ This" in a text box on the client, and when the server end is SCO, it is able to 'accept' the Ñ character sent to it from the client. When I try this same example on Linux, that Ñ character (hex D1) does not 'make it' from the client to the server. (Please forgive my layman's language).
The problem is not on the client, and I have verified that the telnet connection is in fact passing these extended characters, but they are not recognized properly by the Linux server. In researching this, I've played with changing the LANG environment variable from LANG=en_US.UTF-8 to several of the other possible values found in /usr/lib/locale for a European locale (the end user is actually in Spain), and these 'euro' characters are still not handled properly in my application. Would anyone be able to point me to any specific env variable settings, and/or anything else that would resolve this issue?
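For what it's worth, hex D1 is how Ñ is encoded in ISO-8859-1 (Latin-1), not in UTF-8. Here is a minimal Python sketch, assuming the client sends single-byte Latin-1 text (the thread doesn't confirm which encoding the client actually uses), showing why a UTF-8 locale would not accept that byte on its own:

```python
# Assumption: the client sends Ñ as the single byte 0xD1 (ISO-8859-1 / Latin-1).
raw = b"Test \xd1 This"

# Decoded as Latin-1, the byte maps to Ñ.
latin1_text = raw.decode("iso-8859-1")
print(latin1_text)

# Decoded as UTF-8, 0xD1 is a lead byte that must be followed by a
# continuation byte, so a lone 0xD1 followed by a space is invalid.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)

# In UTF-8 the same character takes two bytes: C3 91.
utf8_bytes = "Ñ".encode("utf-8")
print(utf8_bytes.hex())
```

So if the server side is running under a UTF-8 locale, a single D1 byte coming from the client may need converting (or the server may need a Latin-1 locale) before it is treated as Ñ.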
That sounds like you don't have a Unicode locale turned on system-wide.
In /etc/profile.d/lang.sh (the name and location may vary depending upon your distribution), there should be a setting to enable en_US.UTF-8 (you would comment out the setting for LANG=en_US or, possibly, LANG=en_US.ISO8859-1 and uncomment LANG=en_US.UTF-8). If you don't have /etc/profile.d/lang.sh, try Code:
prompt: cd /etc
You'll probably need to log out and log back in again for any change to take effect (the login program has to be executed to make it happen -- if you're running a GUI login-thingy you may want to reboot). Hope this helps some.
tronayne - I do see in /etc/profile.d/lang.sh that the LANG env variable is being set to what you suggest. A "locale" output gives me:
LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C
I've also tried "es_ES.utf8" and "es_ES@euro" for LANG, and this doesn't change anything from what I can tell. I've also tried setting the LC_ALL to these two, plus the "en_US.UTF-8", and that still does not make a difference. Here's the output from "set" in case this is useful:
ABTERM=fwycol
BASH=/bin/sh
BASH_ARGC=()
BASH_ARGV=()
BASH_LINENO=()
BASH_SOURCE=()
BASH_VERSINFO=([0]="3" [1]="2" [2]="25" [3]="1" [4]="release" [5]="i686-redhat-linux-gnu")
BASH_VERSION='3.2.25(1)-release'
COLS=80
COLUMNS=78
CVS_RSH=ssh
DIRSTACK=()
EUID=0
EXMENU_PID=15242
EXMENU_SHELL=ABLOG
F10_ABORT=Y
F10_EXIT=Y
FLIP_BREAK='^C'
FLIP_CLIENT=root
FLIP_ESC_DELAY=200
FLIP_SHOWTIME=Y
FLPCTL_2SEC=Y
FLPCTL_ABORT=Y
FLPCTL_CHAN=1
FLPCTL_PID=15238
FLPCTL_TTY=/dev/pts/6
FLPCTL_UNIQUE=N
GINDTA_IDX=0
GROUPS=()
G_BROKEN_FILENAMES=1
HISTSIZE=1000
HOME=/root
HOSTNAME=show.genisys.com
HOSTTYPE=i686
IFS=$' \t\n'
INPUTRC=/etc/inputrc
LANG=en_US.UTF-8
LC_ALL=C
LESSOPEN='|/usr/bin/lesspipe.sh %s'
LINES=24
LINUX_SYSTEM=Y
LOGNAME=root
LS_COLORS=
MACHTYPE=i686-redhat-linux-gnu
MAIL=/var/spool/mail/root
METBIN=/metbin
MET_FAST_RPT_NAMES=N
MET_OPR_COUNT=Y
NOLOKSER=TRUE
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:.:/metbin:/usr/met:/usr/met/sy:/usr/met/sy/bin:/usr/met/sy/vics_cmd:/etc:/root/bin:/usr/local:/met/sy/bin:/met/sy/cmd
PERL5LIB=/usr/local/lib/perl5
POSIXLY_CORRECT=y
PPID=15242
PS4='+ '
PWD=/etc/profile.d
RUNPATH=./:../:/met/sy:/met/sy/../ss
SCHEMA_SPEED=Y
SHELL=/bin/bash
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=2
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SSH_CLIENT='192.168.10.185 50971 22'
SSH_CONNECTION='192.168.10.185 50971 192.168.10.65 22'
SSH_TTY=/dev/pts/6
SY_LOC=/met/sy
TERM=fwycol_f
TY=page
UID=0
USC_STYLE=VICS
USER=root
UV_STYLE=VICS
_=/metbin/ablog
Any other suggestions on what to try? (I greatly appreciate whatever feedback is given here. Thank you all very much!)
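One thing that stands out in that output: LC_ALL=C is set alongside LANG=en_US.UTF-8. LC_ALL takes precedence over every individual LC_* variable and over LANG, which would be consistent with every category reporting "C" no matter what LANG is. Here is a rough Python sketch of that lookup order. The function name effective_locale is made up for illustration, the "C" fallback is an assumption about the default, and the code only mimics the documented precedence rather than calling the C library:

```python
def effective_locale(env, category):
    """Return the locale name a category resolves to.

    Lookup order: LC_ALL, then the category's own variable, then LANG.
    Empty values count as unset. Falls back to "C" (assumed default).
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"

# Values taken from the "set" output in this post.
env = {"LANG": "en_US.UTF-8", "LC_ALL": "C"}
print(effective_locale(env, "LC_CTYPE"))

# With LC_ALL removed, LANG is what LC_CTYPE falls back to.
env_without_lc_all = {"LANG": "en_US.UTF-8"}
print(effective_locale(env_without_lc_all, "LC_CTYPE"))
```

If that is what is happening here, it may be worth finding where LC_ALL=C gets exported (a profile script or the application's own startup) and unsetting it before testing LANG again.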
How is your app sending this data to the Linux server?
I have an app made in VB.NET and my server is CentOS 5.5, and I can send ñÑáéíóú without a problem. Regards
Did you go all the way out of X to the console, exit and log back in -- you really have to do that (or reboot the system) for the change to be available -- it will show up in set or locale but won't be in effect until you get all the way out to a console, log out and back in (or reboot if you're using a GUI login).
If you go to Dugan Chen's web site (http://www.vcn.bc.ca/~dugan/slackware-fonts/) and download his test file, ucs-fonts.tar.gz, then
Hope this helps some.
(I do get all the way out, or reboot the Linux machine when I'm testing changes). I followed the instructions above from tronayne, and when I cat out the file, in the Spanish section I see:
Spanish (es)
------------
El pingC<ino Wenceslao hizo kilC3metros bajo exhaustiva lluvia y frC-o, aC1oraba a su querido cachorro.
So this is telling me that I don't have something set up correctly, I'm assuming. So thanks for that confirmation.
In further testing and analysis, I can now effectively receive these extended characters from the client to the server. So I'm making progress, and it points again to the server. But I still have issues going from the server to the client. I'm using simple standard output to send text from the server to the client, via telnet. And my guess is that if catting out that quickbrown.txt test file could yield success, it'd probably resolve my issue.
The only environment variables that I can think of to share are:
LANG=en_US.UTF-8
MACHTYPE=i686-redhat-linux-gnu
Apologies to all here for being ignorant, but I simply can't put my finger on this. Anyone care to share their ENV variables (from 'set' and 'locale') in case that'd help? Any other suggestions? Thanks to all who have replied. I really appreciate it.
Telnet? Doesn't that strip out the 8th bit or something like that? Is there a setting to make it do binary, or could you use SSH instead (with PuTTY at http://www.chiark.greenend.org.uk/~sgtatham/putty on the Windows side and SSH on the Linux side)?
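If the 8th bit really is being dropped somewhere along the way, that would match the garbled Spanish line quoted earlier in the thread. A small Python sketch, assuming the test file is UTF-8 and that each byte simply loses its top bit (neither is confirmed here), reproduces the same garbling:

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only link would."""
    return bytes(b & 0x7F for b in data)

# Words from the Spanish pangram in the test file.
for word in ("pingüino", "kilómetros", "frío", "añoraba"):
    mangled = strip_high_bit(word.encode("utf-8"))
    # Every byte is now below 0x80, so decoding as ASCII is safe.
    print(word, "->", mangled.decode("ascii"))

# Prints:
# pingüino -> pingC<ino
# kilómetros -> kilC3metros
# frío -> frC-o
# añoraba -> aC1oraba
```

Each accented letter is two bytes in UTF-8, starting with C3; with the top bit cleared, C3 becomes a plain "C" and the second byte becomes "<", "3", "-" or "1", which is what the cat output showed.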
Just for grins, on my system locale shows the following and the entire quickbrown.txt file displays every language in the file correctly. Code:
locale