Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Hello there. I'm having a problem with my text editor: it cannot read Japanese text. Is there any way, or any software available in Linux, to convert this text? It all appears as junk.
It might help if you told us exactly which text editor you're using. Perhaps it can handle the text if set up correctly. Do you have Japanese fonts installed? What language environment are you using? It's all in the details.
Well, anyway, you should be able to use iconv to convert the files to UTF-8, which most newer editors can handle. You do need to know the encoding of the original file, though.
BTW, this is the non-*nix programming forum. I believe you meant to post to the regular programming forum, no?
Often, you can use the "file" program to guess which encoding is used in a text file.
Code:
jschiwal@hpamd64:~> cat >testfile
Здравсвуийте
jschiwal@hpamd64:~> file testfile
testfile: UTF-8 Unicode text
However, it will sometimes guess wrong. "iconv -l" lists the supported encodings. Also look at Wikipedia; it has a good article explaining the different character encoding standards, as well as charts of the characters in various standards.
Code:
jschiwal@hpamd64:~> cat testfile2
������������
jschiwal@hpamd64:~> iconv -f ISO-8859 -t UTF-8 testfile2 -o testfile3
iconv: conversion from `ISO-8859' is not supported
Try `iconv --help' or `iconv --usage' for more information.
jschiwal@hpamd64:~> iconv -f WINDOWS-1251 -t UTF-8 testfile2 -o testfile3
jschiwal@hpamd64:~> cat testfile3
Здравсвуийте
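If you would rather do the same conversion from a script, here is a minimal Python sketch of the iconv call above. The function name convert_file is made up for this example and is not part of any library; cp1251 is Python's codec name for WINDOWS-1251. As with iconv, you still have to know (or guess) the source encoding, and Python raises LookupError for an encoding name it does not recognize.

```python
def convert_file(src, dst, from_encoding, to_encoding="utf-8"):
    """Re-encode a text file, roughly like: iconv -f FROM -t TO src -o dst.

    convert_file is a hypothetical helper written for this example.
    """
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=from_encoding, newline="") as infile:
        text = infile.read()  # raises UnicodeDecodeError if from_encoding is wrong
    with open(dst, "w", encoding=to_encoding, newline="") as outfile:
        outfile.write(text)


# Example call, mirroring the session above:
# convert_file("testfile2", "testfile3", "cp1251")
```

Reading the whole file into memory is fine for ordinary text files; for very large files you would want to copy in chunks instead.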
Japanese text is most commonly encoded in Shift_JIS, with JIS (ISO-2022-JP) and EUC-JP also being possible. The last one is mostly found in web documents.
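When you don't know which of these encodings a file uses, one rough approach is to try decoding it with each candidate and keep the first one that works. The sketch below does that in Python, using Python's codec names (iso2022_jp, euc_jp, shift_jis); the function name guess_japanese_encoding is hypothetical. Treat the result as a guess only: short or unusual inputs can decode without error in the wrong encoding.

```python
def guess_japanese_encoding(data):
    """Return the first candidate encoding that decodes `data` without error.

    Hypothetical helper for illustration; returns None if nothing fits.
    """
    candidates = ["utf-8", "euc_jp", "shift_jis"]
    # ISO-2022-JP is 7-bit and switches character sets with ESC sequences,
    # so its bytes would also pass as UTF-8. Try it first when ESC is present.
    if b"\x1b" in data:
        candidates.insert(0, "iso2022_jp")
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


# Example call:
# with open("textfile.txt", "rb") as f:
#     print(guess_japanese_encoding(f.read()))
```

The order matters: EUC-JP is tried before Shift_JIS here, which is an arbitrary choice for this sketch. If the guess looks wrong, open the file with the other candidates and compare by eye.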
Quote:
Originally Posted by jaepi
Hello there. I'm having a problem with my text editor: it cannot read Japanese text. Is there any way, or any software available in Linux, to convert this text? It all appears as junk.
高鐵賣的好 散人潮
1925年高鐵賣的好 散人潮
1925年に創業し、80年を越える歴史を持つラックスマンは、プリアンプ、パワーアンプ、プリメインアンプ、真空管アンプ等の高級オーディオ製品の優れたブランドとして
The above is random example text as displayed in leafpad and kwrite. Some of it is Chinese, and the '1925' text is Japanese.
Ok, apparently for gedit, you have to specify the encoding when you open the file. Start the program, then hit "file open". At the bottom of the file dialog should be a selector for character encoding, probably set to "auto detected". Click on that and use the "add or remove" dialog to add all the Japanese types to the menu. After that it should be able to auto-detect the encoding, or at least it will let you try all the options until you find one that works.
You can also launch it from the command line with a specific encoding using:
gedit --encoding=SHIFT_JIS textfile.txt
I think it's best, though, to convert and save all your text files as UTF-8 anyway (in addition to iconv, you can use the "save as" dialog in gedit to set the output encoding in the same way as above). It's the standard encoding of modern Linux distros and can handle all languages. The computing world is headed to Unicode, and all these old encodings are slowly dying out.
Oh, and don't forget that you also need to have a Japanese-capable font installed and selected in your preferences.
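For the "convert everything to UTF-8" suggestion above, here is a small Python sketch that walks a directory and rewrites text files in place. It assumes all the non-UTF-8 files share one known source encoding (Shift_JIS in the example call); the function name convert_tree_to_utf8 and the "*.txt" pattern are made up for illustration. It overwrites files, so try it on a copy first.

```python
from pathlib import Path


def convert_tree_to_utf8(root, source_encoding, pattern="*.txt"):
    """Rewrite matching files under `root` as UTF-8; return the paths changed.

    Hypothetical helper: assumes every non-UTF-8 file uses `source_encoding`.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
        text = raw.decode(source_encoding)  # raises if the assumption is wrong
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted


# Example call:
# convert_tree_to_utf8("/home/me/docs", "shift_jis")
```

One caveat: ISO-2022-JP files consist only of 7-bit bytes, so they pass the UTF-8 check and get skipped. Convert those separately.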