LinuxQuestions.org
Old 01-06-2008, 09:01 PM   #1
jaepi
Member
 
Registered: Apr 2007
Location: Urban Jungle
Distribution: Ubuntu
Posts: 189
Blog Entries: 1

Rep: Reputation: 30
Help with unreadable comments


Hello there. I'm having a problem with my text editor: it cannot read Japanese text. Is there any way, or any software available in Linux, to convert this text? It all appears as junk.

Last edited by jaepi; 01-06-2008 at 09:11 PM.
 
Old 01-07-2008, 12:37 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
It might help if you told us exactly which text editor you're using. Perhaps it can handle the text if set up correctly. Do you have Japanese fonts installed? What language environment are you using? It's all in the details.

Well anyway, you should be able to use iconv to translate the files into utf-8, which most of the newer editors should be able to handle. You do need to know the encoding of the original file though.
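As a minimal sketch of that conversion: the helper name `to_utf8` and the file names in the example are made up for illustration, and the source encoding is an assumption you have to supply yourself (Shift-JIS is only a guess for a Japanese file).

```shell
# to_utf8 SRC_ENCODING INPUT OUTPUT
# Convert INPUT from SRC_ENCODING to UTF-8 and write the result to OUTPUT.
# INPUT and OUTPUT must be different files: redirecting onto the input
# would truncate it before iconv gets to read it.
to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Example (hypothetical file names):
#   to_utf8 SHIFT_JIS notes.txt notes.utf8.txt
```

Writing to a separate output file keeps the original intact in case the encoding guess turns out to be wrong.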

BTW, this is the non-*nix programming forum. I believe you meant to post to the regular programming forum, no?
 
Old 01-07-2008, 12:53 AM   #3
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
Often, you can use the "file" program to guess which encoding is used in a text file.

Code:
cat >testfile
Здравсвуийте
jschiwal@hpamd64:~> file testfile
testfile: UTF-8 Unicode text
However, sometimes it will guess wrong. "iconv -l" will list the supported encoding schemes. Also look in Wikipedia; they have a good article explaining the different character encoding standards, as well as charts of the characters for various standards.

Code:
cat testfile2
������������
jschiwal@hpamd64:~> iconv -f ISO-8859 -t UTF-8 testfile2 -o testfile3
iconv: conversion from `ISO-8859' is not supported
Try `iconv --help' or `iconv --usage' for more information.
jschiwal@hpamd64:~> iconv -f WINDOWS-1251 -t UTF-8 testfile2 -o testfile3
jschiwal@hpamd64:~> cat testfile3
Здравсвуийте
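The failed attempt above used "ISO-8859", which is the name of a family rather than of one specific encoding (ISO-8859-5 is its Cyrillic member), so iconv rejects it. A rough sketch of two helpers for this guess-then-look-up step follows. The function names are made up for illustration, and the first one assumes a version of "file" that supports the --mime-encoding option.

```shell
# guess_encoding FILE
# Ask "file" for its best guess at the character encoding. This is a
# heuristic and can be wrong, especially for short files.
guess_encoding() {
    file -b --mime-encoding "$1"
}

# find_encoding_names PATTERN
# Search the names iconv accepts, ignoring case, so that you can pass an
# exact name to -f instead of a family name.
find_encoding_names() {
    iconv -l | grep -i -- "$1"
}

# Examples:
#   guess_encoding testfile2
#   find_encoding_names 8859
#   find_encoding_names 1251
```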
 
Old 01-07-2008, 01:22 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Japanese text is most commonly encoded in Shift-JIS, with JIS (ISO-2022-JP) and EUC-JP also being possible. The last one is mostly found in web documents.
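If you don't know which of the three it is, one rough approach is to try each in turn and see which conversions iconv accepts. This is only a sketch: the function name `try_encodings` is made up, and a clean conversion does not prove the guess is right, so still check the converted text by eye.

```shell
# try_encodings FILE
# Attempt to convert FILE to UTF-8 from each common Japanese encoding and
# report whether iconv accepted the input. The output is discarded; this
# only tests whether the bytes are valid in each encoding.
try_encodings() {
    for enc in SHIFT_JIS ISO-2022-JP EUC-JP; do
        if iconv -f "$enc" -t UTF-8 "$1" > /dev/null 2>&1; then
            echo "$enc: converts cleanly"
        else
            echo "$enc: conversion failed"
        fi
    done
}

# Example (hypothetical file name):
#   try_encodings mystery.txt
```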
 
Old 01-07-2008, 06:23 AM   #5
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,511

Rep: Reputation: 3641
Quote:
Originally Posted by jaepi View Post
Hello there. I'm having a problem with my text editor: it cannot read Japanese text. Is there any way, or any software available in Linux, to convert this text? It all appears as junk.
高鐵賣的好 散人潮

1925年高鐵賣的好 散人潮

1925年に創業し、80年を越える歴史を持つラックスマンは、プリアンプ、パワーアンプ、プリメインアンプ、真空管アンプ等の高級オーディオ製品の優れたブランドとして
The above is random example text as displayed in leafpad and kwrite. Some of it is Chinese, and the '1925 text' is Japanese.
 
Old 01-07-2008, 07:46 PM   #6
jaepi
Member
 
Registered: Apr 2007
Location: Urban Jungle
Distribution: Ubuntu
Posts: 189

Original Poster
Blog Entries: 1

Rep: Reputation: 30
Quote:
Originally Posted by David the H. View Post
It might help if you told us exactly which text editor you're using. Perhaps it can handle the text if set up correctly. Do you have Japanese fonts installed? What language environment are you using? It's all in the details.

Well anyway, you should be able to use iconv to translate the files into utf-8, which most of the newer editors should be able to handle. You do need to know the encoding of the original file though.

BTW, this is the non-*nix programming forum. I believe you meant to post to the regular programming forum, no?
I'm using gedit in Ubuntu.
 
Old 01-09-2008, 02:48 AM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Ok, apparently for gedit, you have to specify the encoding when you open the file. Start the program, then hit "file open". At the bottom of the file dialog should be a selector for character encoding, probably set to "auto detected". Click on that and use the "add or remove" dialog to add all the Japanese types to the menu. After that it should be able to auto-detect the encoding, or at least it will let you try all the options until you find one that works.

You can also launch it from the command line with a specific encoding using:

gedit --encoding=SHIFT_JIS textfile.txt

I think it's best, though, to convert and save all your text files as UTF-8 anyway (in addition to iconv, you can use the "save as" dialog in gedit to set the output encoding in the same way as above). It's the standard encoding of modern Linux distros and can handle all languages. The computing world is headed to Unicode, and all these old encodings are slowly dying out.
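For a whole directory of files, the iconv route could look roughly like the sketch below. The function name and the ".utf8.txt" suffix are made up for illustration, and it assumes every .txt file in the directory shares the same source encoding.

```shell
# convert_txt_files SRC_ENCODING DIR
# Convert each *.txt file in DIR to UTF-8, writing name.utf8.txt next to
# the original and leaving the original untouched.
convert_txt_files() {
    src_enc=$1
    dir=$2
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue                   # glob matched nothing
        case $f in *.utf8.txt) continue ;; esac   # skip earlier output
        out="${f%.txt}.utf8.txt"
        if iconv -f "$src_enc" -t UTF-8 "$f" > "$out"; then
            echo "converted: $f -> $out"
        else
            echo "failed: $f" >&2
            rm -f "$out"                          # drop partial output
        fi
    done
}

# Example (hypothetical directory):
#   convert_txt_files SHIFT_JIS ~/documents
```

Keeping the originals means a wrong encoding guess costs nothing: delete the .utf8.txt files and try again with a different source encoding.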

Oh, and don't forget that you also need to have a Japanese-capable font installed and selected in your preferences.
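One way to check for such a font, assuming fontconfig's fc-list tool is installed (the function name is made up for illustration):

```shell
# list_japanese_fonts
# Print the families of installed fonts that declare Japanese coverage,
# or a short message if fc-list is missing or nothing matches.
list_japanese_fonts() {
    if ! command -v fc-list > /dev/null 2>&1; then
        echo "fc-list not found; install fontconfig first"
        return 1
    fi
    fonts=$(fc-list :lang=ja family | sort -u)
    if [ -z "$fonts" ]; then
        echo "no fonts declaring Japanese coverage were found"
        return 1
    fi
    printf '%s\n' "$fonts"
}

# Example:
#   list_japanese_fonts
```

If the list comes back empty, install a Japanese font package from your distribution before changing any editor settings.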
 
  

