LinuxQuestions.org
02-16-2009, 08:45 AM   #1
m4rtin
Member
 
Registered: Sep 2007
Posts: 261

Rep: Reputation: 16
iconv us-ascii to UTF-8 or ISO-8859-15


Why isn't it possible to convert us-ascii or ASCII to UTF-8? Or am I doing something wrong?

Code:
root@martin-desktop:/home/martin/test# nano file1.txt
root@martin-desktop:/home/martin/test# file --mime file1.txt 
file1.txt: text/plain charset=us-ascii
root@martin-desktop:/home/martin/test# iconv -c -f ASCII -t UTF-8 file1.txt > file2.txt
root@martin-desktop:/home/martin/test# file --mime file2.txt 
file2.txt: text/plain charset=us-ascii
Even from us-ascii to ISO-8859-15 doesn't work:

Code:
root@martin-desktop:/home/martin/test# iconv -c -f ASCII -t ISO-8859-15 file1.txt > file3.txt
root@martin-desktop:/home/martin/test# file --mime file3.txt
file3.txt: text/plain charset=us-ascii
What might be the problem?

Last edited by m4rtin; 02-16-2009 at 08:46 AM.
 
02-16-2009, 09:06 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
I may be wrong, but I believe it's because the first 128 code points of UTF-8 and of the ISO-8859 family are identical to ASCII, so a file that contains only ASCII characters comes out of the conversion unchanged. There's no reason for the text file to look any different until non-ASCII characters are introduced. Put another way, file can't tell the difference between them, since it can only guess the encoding from the bytes it sees. As soon as you add a non-ASCII character the file output should change.
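If you want to check this yourself, here is a minimal sketch. It assumes iconv, cmp and wc are installed, and the file names under the temporary directory are made up for the demo. It compares the converted files byte for byte, then adds a single non-ASCII character (é) to show where the two target encodings start to differ.

```shell
# Throwaway directory; every file name below is hypothetical.
tmp=$(mktemp -d)

# Pure ASCII input: the conversion has nothing to change.
printf 'hello\n' > "$tmp/ascii.txt"
iconv -f ASCII -t UTF-8 "$tmp/ascii.txt" > "$tmp/ascii_utf8.txt"
iconv -f ASCII -t ISO-8859-15 "$tmp/ascii.txt" > "$tmp/ascii_latin9.txt"

# cmp exits with status 0 only when the two files are byte-identical.
cmp "$tmp/ascii.txt" "$tmp/ascii_utf8.txt" && echo "UTF-8 output is identical"
cmp "$tmp/ascii.txt" "$tmp/ascii_latin9.txt" && echo "ISO-8859-15 output is identical"

# Now one non-ASCII character: e-acute, written as its UTF-8 bytes 0xC3 0xA9.
printf 'caf\303\251\n' > "$tmp/cafe_utf8.txt"
iconv -f UTF-8 -t ISO-8859-15 "$tmp/cafe_utf8.txt" > "$tmp/cafe_latin9.txt"

# The UTF-8 file needs two bytes for e-acute, the ISO-8859-15 file only one.
wc -c < "$tmp/cafe_utf8.txt"     # 6 bytes
wc -c < "$tmp/cafe_latin9.txt"   # 5 bytes
```

Running file --mime on the two cafe files should then report different charsets, although the exact wording depends on your version of file.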
 
02-18-2009, 07:34 PM   #3
servat78
Member
 
Registered: Jan 2009
Posts: 100

Rep: Reputation: 17
The previous post is correct. The ASCII encoding, which contains the 128 basic characters, is exactly the same in UTF-8. UTF-8 does its tricks only for characters above the ASCII range. Technically, an ASCII text file and a UTF-8 file with the same contents are byte-for-byte identical.

It would be a different case when converting ASCII to UTF-16, because UTF-16 uses at least two bytes per character, so the conversion would immediately about double the file size.
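A rough way to see the size difference, assuming iconv and wc are available (the file names are again made up). The sketch uses UTF-16LE as the target so that the byte order is fixed. With plain UTF-16 as the target, iconv usually also writes a 2-byte byte-order mark at the start, so the result would be slightly more than double.

```shell
# Throwaway directory; the file names are hypothetical.
tmp=$(mktemp -d)

# Six ASCII bytes: h e l l o and a newline.
printf 'hello\n' > "$tmp/ascii.txt"

# UTF-16LE stores each of these characters as two bytes.
iconv -f ASCII -t UTF-16LE "$tmp/ascii.txt" > "$tmp/utf16le.txt"

wc -c < "$tmp/ascii.txt"     # 6 bytes
wc -c < "$tmp/utf16le.txt"   # 12 bytes
```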
 
  

