LinuxQuestions.org

Genisys 03-04-2011 09:40 AM

Getting extended ascii (chars > 127) to work in Linux
 
I have an application where the client is written in Visual Studio (C#), run on PCs, and the server end has traditionally been SCO. We're now migrating to Linux. I can, for example, input "Test Ñ This" in a text box on the client, and when the server end is SCO, it is able to 'accept' the Ñ character sent to it from the client. When I try this same example on Linux, that Ñ character (hex D1) does not 'make it' from the client to the server. (Please forgive my layman's language).

The problem is not on the client, and I have verified that the telnet connection is in fact passing these extended characters, but they are not recognized properly by the Linux server.

In researching this, I've tried changing the LANG environment variable from LANG=en_US.UTF-8 to several of the other values found in /usr/lib/locale for a European locale (the end user is actually in Spain), and these 'euro' characters are still not handled properly in my application.

Would anyone be able to point me to any specific env variable settings, and/or anything else that would resolve this issue?

tronayne 03-04-2011 10:20 AM

That sounds like you don't have a Unicode locale turned on system-wide.

In /etc/profile.d/lang.sh (the name and location may vary depending upon your distribution), there should be a setting to enable en_US.UTF-8 (you would comment out the setting for LANG=en_US or, possibly, LANG=en_US.ISO8859-1 and uncomment LANG=en_US.UTF-8).

If you don't have /etc/profile.d/lang.sh, try
Code:

prompt: cd /etc
prompt: find . -type f -print | xargs grep -il "UTF-8"

to find the correct file.

You'll probably need to log out and log back again for any change to take effect (the login program has to be executed to make it happen -- if you're running a GUI login-thingy you may want to reboot).
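
Separately from the locale setting, it may be worth confirming which encoding the client actually sends. The question mentions Ñ arriving as hex D1, which is the single-byte ISO-8859-1 (and Windows-1252) code for Ñ; in UTF-8 the same character is the two bytes C3 91. A minimal sketch of the difference, assuming iconv and od are installed (the helper names are made up for illustration):

```shell
# Hypothetical helpers, for illustration only.

# Emit the character Ñ as the single ISO-8859-1 byte 0xD1 (octal 321).
n_tilde_latin1() {
    printf '\321'
}

# Show whatever arrives on stdin as a compact hex string.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

n_tilde_latin1 | hex_bytes                                  # d1
n_tilde_latin1 | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes   # c391
```

If the client really does send a bare D1, a UTF-8 locale on the server would see it as an incomplete sequence, so the two ends would have to agree on one encoding (or convert, as iconv does here).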

Hope this helps some.

Genisys 03-04-2011 10:44 AM

tronayne - I do see in /etc/profile.d/lang.sh that the LANG env variable is being set to what you suggest. A "locale" output gives me:

LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C

I've also tried "es_ES.utf8" and "es_ES@euro" for LANG, and this doesn't change anything from what I can tell. I've also tried setting the LC_ALL to these two, plus the "en_US.UTF-8", and that still does not make a difference. Here's the output from "set" in case this is useful:

ABTERM=fwycol
BASH=/bin/sh
BASH_ARGC=()
BASH_ARGV=()
BASH_LINENO=()
BASH_SOURCE=()
BASH_VERSINFO=([0]="3" [1]="2" [2]="25" [3]="1" [4]="release" [5]="i686-redhat-linux-gnu")
BASH_VERSION='3.2.25(1)-release'
COLS=80
COLUMNS=78
CVS_RSH=ssh
DIRSTACK=()
EUID=0
EXMENU_PID=15242
EXMENU_SHELL=ABLOG
F10_ABORT=Y
F10_EXIT=Y
FLIP_BREAK='^C'
FLIP_CLIENT=root
FLIP_ESC_DELAY=200
FLIP_SHOWTIME=Y
FLPCTL_2SEC=Y
FLPCTL_ABORT=Y
FLPCTL_CHAN=1
FLPCTL_PID=15238
FLPCTL_TTY=/dev/pts/6
FLPCTL_UNIQUE=N
GINDTA_IDX=0
GROUPS=()
G_BROKEN_FILENAMES=1
HISTSIZE=1000
HOME=/root
HOSTNAME=show.genisys.com
HOSTTYPE=i686
IFS=$' \t\n'
INPUTRC=/etc/inputrc
LANG=en_US.UTF-8
LC_ALL=C
LESSOPEN='|/usr/bin/lesspipe.sh %s'
LINES=24
LINUX_SYSTEM=Y
LOGNAME=root
LS_COLORS=
MACHTYPE=i686-redhat-linux-gnu
MAIL=/var/spool/mail/root
METBIN=/metbin
MET_FAST_RPT_NAMES=N
MET_OPR_COUNT=Y
NOLOKSER=TRUE
OPTERR=1
OPTIND=1
OSTYPE=linux-gnu
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:.:/metbin:/usr/met:/usr/met/sy:/usr/met/sy/bin:/usr/met/sy/vics_cmd:/etc:/root/bin:/usr/local:/met/sy/bin:/met/sy/cmd
PERL5LIB=/usr/local/lib/perl5
POSIXLY_CORRECT=y
PPID=15242
PS4='+ '
PWD=/etc/profile.d
RUNPATH=./:../:/met/sy:/met/sy/../ss
SCHEMA_SPEED=Y
SHELL=/bin/bash
SHELLOPTS=braceexpand:hashall:interactive-comments:posix
SHLVL=2
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SSH_CLIENT='192.168.10.185 50971 22'
SSH_CONNECTION='192.168.10.185 50971 192.168.10.65 22'
SSH_TTY=/dev/pts/6
SY_LOC=/met/sy
TERM=fwycol_f
TY=page
UID=0
USC_STYLE=VICS
USER=root
UV_STYLE=VICS
_=/metbin/ablog

Any other suggestions on what to try? (I greatly appreciate whatever feedback is given here. Thank you all very much!)

fcintron 03-04-2011 01:32 PM

How is your app sending this data to the Linux server?

I have an app made in VB.NET and my server is CentOS 5.5, and I can send ñÑáéíóú without a problem.


Regards

tronayne 03-04-2011 01:41 PM

Did you go all the way out of X to the console, exit, and log back in? You really have to do that (or reboot the system) for the change to be available. It will show up in set or locale but won't be in effect until you get all the way out to a console, log out and back in (or reboot if you're using a GUI login).

If you go to Dugan Chen's web site (http://www.vcn.bc.ca/~dugan/slackware-fonts/) and download his test file, ucs-fonts.tar.gz, then
  1. create a directory; e.g., unicode
  2. cd unicode
  3. tar xzvf ../ucs-fonts.tar.gz
  4. cd examples
  5. cat quickbrown.txt
That will show you whether you have, in fact, got your environment settings correctly set.

Hope this helps some.

Genisys 03-04-2011 03:52 PM

(I do get all the way out, or reboot the Linux machine when I'm testing changes). I followed the instructions above from tronayne, and when I cat out the file, in the Spanish section I see:

Spanish (es)
------------

El pingC<ino Wenceslao hizo kilC3metros bajo exhaustiva lluvia y
frC-o, aC1oraba a su querido cachorro.

So this is telling me that I don't have something set up correctly, I'm assuming. So thanks for that confirmation. In further testing and analysis, I can now effectively receive these extended characters from the client to the server. So I'm making progress, and it points again to the server. But I still have issues going from the server to the client. I'm using simple standard output to send text from the server to the client, via telnet. And my guess is that if catting out that quickbrown.txt test file could yield success, it'd probably resolve my issue. The only environment variables that I can think of to share are:

LANG=en_US.UTF-8
MACHTYPE=i686-redhat-linux-gnu

Apologies to all here for being ignorant, but I simply can't put my finger on this. Anyone care to share their ENV variables (from 'set' and 'locale') in case that'd help? Any other suggestions? Thanks to all who have replied. I really appreciate it.

tronayne 03-04-2011 04:26 PM

Telnet? Doesn't that strip out the 8th bit or something like that? Is there a setting to make it do binary, or could you use ssh (with PuTTY at http://www.chiark.greenend.org.uk/~sgtatham/putty on the Windows side and SSH on the Linux side)?
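
The garbled Spanish text posted above is at least consistent with that guess. A small sketch that clears the 8th bit of every byte of UTF-8 text reproduces the same pattern (strip_high_bit is a made-up helper, and the byte-range mapping assumes a tr that handles octal ranges the way GNU tr does):

```shell
# Hypothetical helper: map each byte 0x80-0xFF to the same byte with
# the high bit cleared (0x00-0x7F), leaving 7-bit ASCII untouched.
strip_high_bit() {
    LC_ALL=C tr '\200-\377' '\000-\177'
}

# "pingüino" in UTF-8: ü is the two bytes 0xC3 0xBC (octal 303 274).
# 0xC3 loses its top bit and becomes 0x43 'C'; 0xBC becomes 0x3C '<'.
printf 'ping\303\274ino\n' | strip_high_bit    # pingC<ino

# "añoraba": ñ is 0xC3 0xB1, which becomes 'C' followed by '1'.
printf 'a\303\261oraba\n' | strip_high_bit     # aC1oraba
```

That matches the "pingC<ino" and "aC1oraba" in the cat output, so something between the file and the screen looks like it is passing only 7 bits. Whether that is telnet, the tty settings, or the terminal definition is not something this sketch can tell.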

Just for grins, on my system locale shows the following and the entire quickbrown.txt file displays every language in the file correctly.
Code:

locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
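
One difference from the locale output posted earlier in the thread: there LC_ALL=C is set, and LC_ALL takes precedence over every LC_* category and over LANG, so LANG=en_US.UTF-8 has no effect while it is set. (The double-quoted values in locale output are implied by LANG or LC_ALL rather than set explicitly.) A rough sketch of the lookup order for the character-type category, where effective_ctype is a hypothetical helper and not a real command:

```shell
# Hypothetical helper illustrating locale precedence for LC_CTYPE.
# Arguments: $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG (empty means unset).
effective_ctype() {
    if [ -n "$1" ]; then
        echo "$1"        # LC_ALL overrides everything else
    elif [ -n "$2" ]; then
        echo "$2"        # then the specific category variable
    elif [ -n "$3" ]; then
        echo "$3"        # then LANG as the general fallback
    else
        echo "C"         # nothing set: assume the C/POSIX default
    fi
}

effective_ctype "C" "" "en_US.UTF-8"    # C            (the earlier output)
effective_ctype ""  "" "en_US.UTF-8"    # en_US.UTF-8  (the output above)
```

If that is what is happening, unsetting LC_ALL (or finding where it gets exported) would be the thing to try, although the earlier report that changing LC_ALL made no difference suggests it may not be the whole story.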

Hope this helps some.


All times are GMT -5.