LinuxQuestions.org
Old 03-11-2008, 08:54 AM   #1
exscape
Member
 
Registered: Aug 2007
Location: Sweden
Distribution: OS X, Gentoo, FreeBSD
Posts: 82

Rep: Reputation: 15
Perl ignoring my locale?


It doesn't give me any errors, but it seems to ignore the locale completely. If I set LC_ALL to something invalid, it does complain as expected; so I suppose it does read the locale successfully.

Here's the code:
Code:
use locale;
use warnings;
use strict;

# Print LC_*
print "$_ = $ENV{$_}\n" for grep /^LC/, keys %ENV;
print "\n";

my @arr = qw/ett gng ord i uppercase  test /;

for (@arr)
{
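	# Classify each entry with \w and print it next to its uppercase form.
	print "nonword: $_; uppercase: " . "\U$_\n" if !/^\w+$/;
	# An empty pattern (//) silently reuses the last successful match, so spell the pattern out.
	print "word: $_; uppercase: " . "\U$_\n" if /^\w+$/;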
}
Here's how I run it:
$ export LC_ALL="sv_SE.utf8"; LC_CTYPE="${LC_ALL}" perl test.pl
(I know that's redundant)
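
Editor's note: printing %ENV only shows what the environment requested. A minimal sketch for checking which LC_CTYPE locale the Perl process actually ended up with, assuming the core POSIX module is available (the variable name $ctype is just for illustration):

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Calling setlocale() with only a category queries the current setting
# instead of changing it.
my $ctype = setlocale(LC_CTYPE);

print "LC_CTYPE in effect: ", (defined $ctype ? $ctype : '(unknown)'), "\n";
```

If this prints "C" or "POSIX" while the environment says sv_SE.utf8, the requested locale was probably not installed or not accepted.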

Output:
Quote:
LC_ALL = sv_SE.utf8
LC_CTYPE = sv_SE.utf8

word: ett; uppercase: ETT
nonword: gng; uppercase: GNG
word: ord; uppercase: ORD
word: i; uppercase: I
word: uppercase; uppercase: UPPERCASE
nonword: ; uppercase:
nonword: test; uppercase: TEST
nonword: ; uppercase:
Expected output:
Quote:
LC_ALL = sv_SE.utf8
LC_CTYPE = sv_SE.utf8

word: ett; uppercase: ETT
word: gng; uppercase: GNG
word: ord; uppercase: ORD
word: i; uppercase: I
word: uppercase; uppercase: UPPERCASE
word: ; uppercase:
word: test; uppercase: TEST
word: ; uppercase:
Any advice?
BTW, the locale works in the shell.
 
Old 03-11-2008, 02:12 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,974
Blog Entries: 11

Rep: Reputation: 879
Which locale did you use when ENTERING those words in the script? Are you
sure they're valid UTF-8?


Cheers,
Tink
 
Old 03-11-2008, 02:42 PM   #3
exscape
Member
 
Registered: Aug 2007
Location: Sweden
Distribution: OS X, Gentoo, FreeBSD
Posts: 82

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Tinkster
Which locale did you use when ENTERING those words in the script? Are you
sure they're valid UTF-8?

Cheers,
Tink
Urggh! Very good question. I can't answer it at the moment, though.
Hm. I fixed /etc/profile and some other stuff, and it appears my shell is running with the correct locale, at least.
If I type "ls [garbage]", it tells me in Swedish that the file wasn't found. Bash itself still speaks English, but I'm not sure I even have the Swedish messages for bash. At least I couldn't find them in /usr/share/locale, but I'm not too good at this.

In any case, I rewrote the program to use @ARGV instead: same deal. Also, after re-creating the file in another editor (nano, compiled with --enable-utf8): same deal.
I hate charsets, big time, and always have... Grr.
In short: no, I'm not sure the chars are valid UTF-8.

iconv -c -f utf8 -t iso8859-1 test.pl
shows them as ?'s. I suppose that is bad. Or, is it normal, since my terminal is UTF-8?
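
Editor's note: the question of whether the bytes are valid UTF-8 can also be checked from Perl itself. This is a rough sketch using the core Encode module. The helper name is_valid_utf8 is made up for this example, and the two sample byte strings are assumptions chosen to show one valid and one invalid case:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Returns 1 if the byte string is well-formed UTF-8, 0 otherwise.
# With FB_CROAK, decode() dies on malformed input, so eval traps the failure.
# $bytes is a private copy, so decode() consuming it does not affect the caller.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# c3 a5 is the UTF-8 encoding of U+00E5; a lone e5 is the same letter in ISO-8859-1.
print is_valid_utf8("\xc3\xa5\n") ? "valid\n" : "invalid\n";
print is_valid_utf8("\xe5\n")     ? "valid\n" : "invalid\n";
```

To check a whole file, the same helper could be fed the file's contents read in raw (binary) mode.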

Edit: Same problem on my laptop (OS X).
Quote:
(rewritten to use <>)
serenity@macbookpro ~ $ echo å > test
serenity@macbookpro ~ $ hexdump -C test
00000000 c3 a5 0a |.|
00000003
serenity@macbookpro ~ $ perl locale.pl test
LC_ALL = sv_SE.UTF-8
LC_CTYPE = UTF-8

nonword: ; uppercase:
Edit: Well, guess what. I installed a brand new Ubuntu VM just to test this: SAME PROBLEM. Unless Perl's locale support sucks (I doubt it), it has to be my code... What's wrong with it, then? :/

Update: I fixed the uppercase thing; I had to add use encoding 'utf8'. However, the characters still don't match \w.
ANOTHER update: It seems Perl doesn't enjoy multibyte characters... Sigh. I can choose between ISO8859-1 and giving up. Until I really need this, I'll choose the former... Unless somebody gives me "the" answer, of course.
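
Editor's note: one likely explanation, offered as a sketch and not as a confirmed diagnosis of the setup above, is that the script handles the input as raw bytes, not as decoded characters. The example below assumes the core Encode module and uses the two bytes from the hexdump (c3 a5); all variable names are illustrative:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for U+00E5, as seen in the hexdump: c3 a5.
my $bytes = "\xc3\xa5";

# Undecoded, Perl sees two separate bytes, and by default neither one
# counts as a word character.
my $bytes_match = $bytes =~ /^\w+$/ ? 1 : 0;

# Decoding turns the two bytes into a single character with Unicode semantics.
my $chars       = decode('UTF-8', $bytes);
my $chars_match = $chars =~ /^\w+$/ ? 1 : 0;
my $upper       = uc $chars;    # U+00C5

# Encode the output again so the terminal receives UTF-8.
binmode STDOUT, ':encoding(UTF-8)';
printf "bytes: length %d, matches \\w: %d\n", length($bytes), $bytes_match;
printf "chars: length %d, matches \\w: %d, uppercase: %s\n",
    length($chars), $chars_match, $upper;
```

For file or STDIN input, an I/O layer such as binmode(STDIN, ':encoding(UTF-8)') does the same decoding step at read time.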

Last edited by exscape; 03-11-2008 at 05:03 PM.
 
Old 03-11-2008, 07:21 PM   #4
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,261

Rep: Reputation: 2028
I'd advise asking the question over at www.perlmonks.org. It's where the gurus hang out, e.g. merlyn (aka Randal Schwartz).
 
  


All times are GMT -5.
