Old 10-27-2008, 05:27 AM   #1
alix123
Perl file handling issue: how to handle a different character set


I have a file that shows the content below when I open it in Notepad. The file is encoded in UTF-16. However, when I open it in Perl with open (FILE,"c:\\file.txt") and read it into an array, the array contains something else (see the second listing below). I want to search for some text in this file, but because the file is UTF-16 encoded, a plain string comparison does not match.
Is there any way to handle this in Perl? I can view the file fine in Notepad, but what I get from Perl's open is very different.

#************* File when opened with notepad **************

L o a d e d s i g n a t u r e s : 4 5 0 0 8 1

S c a n n i n g f i l e C : \ v a u l t \ e i c a r c o m 2 . z i p
S c a n n i n g f i l e C : \ v a u l t \ e i c a r c o m 2 . z i p . c h k s u m
C : \ v a u l t \ e i c a r c o m 2 . z i p : : E i c a r - T e s t - S i g n a t u r e

Time taken in msecs: 1000

#************* File when read with Perl open **************

L\x00o\x00a\x00d\x00e\x00d\x00 \x00s\x00i\x00g\x00n\x00a\x00t\x00u\x00r\x00e\x00s\x00:\x00 \x004\x005\x000\x000\x008\x001\x00
\x00
S\x00c\x00a\x00n\x00n\x00i\x00n\x00g\x00 \x00f\x00i\x00l\x00e\x00 \x00C\x00:\x00\\x00v\x00a\x00u\x00l\x00t\x00\\x00e\x00i\x00c\x00a\x00r\x00c\x00o\x00m\x002\x00.\x00z \x00i\x00p\x00
\x00S\x00c\x00a\x00n\x00n\x00i\x00n\x00g\x00 \x00f\x00i\x00l\x00e\x00 \x00C\x00:\x00\\x00v\x00a\x00u\x00l\x00t\x00\\x00e\x00i\x00c\x00a\x00r\x00c\x00o\x00m\x002\x00.\x00z \x00i\x00p\x00.\x00c\x00h\x00k\x00s\x00u\x00m\x00
\x00C\x00S\x00A\x00C\x00L\x00A\x00M\x00U\x00T\x00I\x00L\x00:\x00C\x00:\x00\\x00v\x00a\x00u\x00l\x00t \x00\\x00e\x00i\x00c\x00a\x00r\x00c\x00o\x00m\x002\x00.\x00z\x00i\x00p\x00:\x00:\x00E\x00i\x00c\x00a \x00r\x00-\x00T\x00e\x00s\x00t\x00-\x00S\x00i\x00g\x00n\x00a\x00t\x00u\x00r\x00e\x00
\x00
Time taken in msecs: 1000

Last edited by alix123; 10-27-2008 at 06:20 AM.
 
Old 10-27-2008, 06:51 AM   #2
Telemachos
I'm not an expert in Unicode handling, but I think this may help:
Code:
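# Decode UTF-16 while reading. Plain "UTF-16" expects a byte-order mark (BOM)
# at the start of the file; without one, name UTF-16LE or UTF-16BE instead.
open( my $fh, "<:encoding(UTF-16)", "c:\\file.txt" )
  || die( "Can't open file.txt: $!" );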
You will probably need to add use Encode::Unicode; at the top of your script. See these two for more information:
http://perldoc.perl.org/perluniintro.html
http://perldoc.perl.org/Encode/Unicode.html
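
To make the suggestion above concrete, here is a minimal sketch of reading a UTF-16 file through an encoding layer and then doing an ordinary substring search on the decoded lines. Some assumptions on my part: the helper name grep_utf16 and the demo file name are made up for illustration, and the default of UTF-16LE is a guess based on the dump in post #1, where each character is followed by \x00 and no BOM is shown. If the real file starts with a BOM, passing "UTF-16" should work instead. The last line of that dump looks like plain single-byte text, so the real file may be mixed, and this sketch does not try to handle that case.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of a UTF-16 file that contain $needle.
# $encoding defaults to UTF-16LE (an assumption; pass "UTF-16" for files with a BOM).
sub grep_utf16 {
    my ( $path, $needle, $encoding ) = @_;
    $encoding ||= 'UTF-16LE';

    # ":raw" removes any CRLF translation layer so it cannot touch the
    # undecoded bytes; ":encoding(...)" then decodes them into characters.
    open( my $fh, "<:raw:encoding($encoding)", $path )
        or die "Can't open $path: $!";

    my @hits;
    while ( my $line = <$fh> ) {
        $line =~ s/\A\x{FEFF}// if $. == 1;    # drop a BOM left over on line 1
        $line =~ s/\r?\n\z//;                  # strip Windows or Unix line ending
        push @hits, $line if index( $line, $needle ) >= 0;
    }
    close $fh;
    return @hits;
}

# Demo: write a small UTF-16LE file, then search it.
my $path = 'demo_utf16.txt';
open( my $out, ">:raw:encoding(UTF-16LE)", $path )
    or die "Can't write $path: $!";
print {$out} "Loaded signatures: 450081\r\n",
             "C:\\vault\\eicarcom2.zip::Eicar-Test-Signature\r\n";
close $out;

my @found = grep_utf16( $path, 'Eicar-Test-Signature' );
print "$_\n" for @found;
unlink $path;
```

Run as is, the demo should print the one matching line, C:\vault\eicarcom2.zip::Eicar-Test-Signature. Once the lines are decoded, eq, index and regular expressions behave as they do on any other Perl string, which is the part that fails when the raw bytes are compared directly.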
 
  

