LinuxQuestions.org
Old 08-25-2008, 11:21 PM   #1
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Rep: Reputation: 58
Perl UTF-8 command line input


The problem is simple: how do I get Perl to accept command-line arguments that contain Unicode (UTF-8) characters?

I started using Perl only recently and would call myself a beginner. However, I have done an extensive amount of searching and testing and cannot find a solution.

I'm using Perl v5.8 on Windows XP, though I expect that a solution for Unix would be similar on Windows. From my testing and searching, it may be a limitation of the console I am using (cmd.exe), but even narrowing this down to a Perl problem or a console problem would be a great help.

Most of what I found on the internet discusses handling UTF-8 internally (i.e. within a Perl script) or in file I/O. I know how to do that, but it is not what I want. I want to run something like "perl myScript.pl ←" and have the script print the argument to the console properly. The arrow is the Unicode character U+2190. The arrow isn't specifically what I want to print; I want to be able to print almost any Unicode character. Right now I can only get it to handle ASCII and extended ASCII characters (integer values 1-255).

To me it seems like a console problem, because as soon as I read the argument it has a value of 63, which is the ASCII code for a question mark. cmd.exe prints a question mark for characters it does not know. I am, however, able to store the hex value of the arrow character in a Perl script and have it printed, so the font I am using supports the character. Again, it just seems to be a problem with cmd.exe passing the character to Perl.
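
The value 63 is what a lossy conversion into a single-byte code page would produce. The sketch below reproduces that effect inside Perl with the core Encode module. It assumes, without confirmation from this thread, that the argument is converted to a code page such as 1252 before perl receives it. `lossy_cp1252` is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode one character into code page 1252 the lossy way.
sub lossy_cp1252 {
    my ($char) = @_;
    # With the default CHECK value, Encode substitutes "?" for any
    # character the target encoding cannot represent.
    return encode('cp1252', $char);
}

# U+2190 (the arrow) has no slot in code page 1252, so it becomes "?".
printf "U+2190 -> %d\n", ord(lossy_cp1252("\x{2190}"));   # prints "U+2190 -> 63"

# U+00E9 (e with acute accent) exists in code page 1252 and survives.
printf "U+00E9 -> %d\n", ord(lossy_cp1252("\x{E9}"));     # prints "U+00E9 -> 233"
```

If that assumption holds, the character is already lost by the time the script starts, and no decoding inside the script can bring it back.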

I have tried many Perl features: use utf8, use Encode, utf8::encode/decode, encode_utf8/decode_utf8, and binmode STDIN/STDOUT ":encoding(UTF-8)". I have also tried the Perl switch "-C", which is supposed to make Perl treat command-line arguments as UTF-8, but it doesn't seem to help. On the Windows side, I set the font to Lucida Console (the only Unicode font cmd.exe supports) and the code page to 65001 (UTF-8).
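
One way to compare these settings is to dump the raw bytes of each argument exactly as perl received them. The following is a minimal diagnostic sketch, not something taken from the thread. `hex_bytes` is a hypothetical helper name, and the script should be run without "-C" so that perl does not decode @ARGV first.

```perl
use strict;
use warnings;

# Hypothetical helper: render the raw bytes of one argument as hex pairs.
sub hex_bytes {
    my ($raw) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $raw;
}

# Print each argument's bytes exactly as perl received them.
for my $i (0 .. $#ARGV) {
    printf "arg %d: %s\n", $i, hex_bytes($ARGV[$i]);
}
```

If the arrow shows up as "e2 86 90", its UTF-8 bytes arrived intact and the remaining work is decoding and output. If it shows up as "3f", it was replaced by a question mark before the script ran.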

Is anyone able to do this, in any operating system? I read somewhere that Perl command-line input is done only in ISO-8859-1 (the Latin-1 character encoding, which Windows code page 1252 closely resembles but does not exactly match), but this allows for only 191 characters, all of which are in the ASCII/extended ASCII set. If this is the case and Perl does not allow Unicode characters as command-line arguments, I can live with that. It just seems strange because I read it from only one source.

Please let me know if you need any more information. Your help is greatly appreciated.

Last edited by nadroj; 08-25-2008 at 11:28 PM.
 
Old 08-26-2008, 06:23 PM   #2
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Original Poster
Rep: Reputation: 58
Update: I was able to give this a shot on openSUSE. Windows disappoints again: my problem is solved on Linux with a two-line Perl script. I've spent almost two weeks trying to get this to work on Windows, and unfortunately that is the target. So this looks like it may be more of a Windows-specific problem.
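
The post does not show the two-line script, so the following is only a guess at what such a script could look like. It assumes a Linux terminal that passes arguments as UTF-8 bytes, and `decode_args` is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw UTF-8 argument bytes into Perl character strings.
sub decode_args {
    return map { decode('UTF-8', $_) } @_;
}

# Encode characters back to UTF-8 when printing to the terminal.
binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for decode_args(@ARGV);
```

On a UTF-8 terminal, running perl with the "-CAS" switch should have a similar effect without explicit decoding, since "A" marks @ARGV as UTF-8 and "S" does the same for the standard streams.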

So, has anyone been able to receive Unicode characters as command-line arguments specifically on Windows?

Thanks
 
Old 08-29-2008, 03:37 AM   #3
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,261

Rep: Reputation: 2028
These guys should have the answer: http://www.perlmonks.org/
 
Old 08-29-2008, 09:02 PM   #4
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Original Poster
Rep: Reputation: 58
First: thanks, Chris. Unfortunately, I think I've read every page on the web that has the words 'perl' and 'utf8' in it. As stated earlier, this IS a Windows console (cmd.exe) problem, as the same script works on Linux with little to no terminal configuration. What I had to settle for was Windows "code page 1252", whose character set covers the printable characters of ISO 8859-1 and ISO 8859-15 (though not always at the same byte values) and so handles most Western European languages. It's no universal Unicode, but it's better than nothing.
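
A minimal sketch of that fallback, assuming both the incoming arguments and the console output use code page 1252 (for example after "chcp 1252"). The thread does not show the actual script, and `decode_cp1252_args` is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: interpret raw argument bytes as code page 1252.
sub decode_cp1252_args {
    return map { decode('cp1252', $_) } @_;
}

# Assumption: the console is also set to code page 1252, so output is
# encoded the same way the arguments came in.
binmode STDOUT, ':encoding(cp1252)';
print "$_\n" for decode_cp1252_args(@ARGV);
```

Decoding first means the script works with real characters internally. For instance, byte 0x80 becomes U+20AC (the euro sign), so length and regex matching behave as expected for anything code page 1252 can represent.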

Though if anyone does ever find a solution, please post it. I will _always_ be interested, especially considering the ~3 weeks I spent researching this. Thanks
 
Old 08-31-2008, 06:09 PM   #5
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,261

Rep: Reputation: 2028
Did you go there? It's a Q&A site for ANY Perl question (any platform). Some of the guys who write Perl itself (and the books) hang out there...
 
Old 08-31-2008, 09:05 PM   #6
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Original Poster
Rep: Reputation: 58
I've searched there during my research, but I didn't post. I just made a post now, so I'll let you know if anything turns up. Thanks
 
All times are GMT -5.