Korean Konundrum Number 2: Enforcing Multilingualism
By Andrew Holmes at 2007-01-14 12:59
While many computer users tend to use an interface requiring only their own native
language, moving out to another country (or, at least, to another language requirement)
creates problems, especially in cases where the user requires the ability to write in
their native language and another, or possibly two or three others. Here in Korea, I have
to write in not only English but also Korean (Hangeul) and Old (Traditional) and
Simplified (I resist the temptation to say simple) Chinese. Here is an account of
my ongoing battle to maintain this capability.
In moving to Korea (and finally building a new PC and getting online), I began to
encounter some problems. As mentioned previously, although the IME in Windows allows
multiple languages, Windows as "licensed" for use by East Asian users limits them to
their single interface language plus "support" for others. Quite apart from the
security troubles mentioned previously which arise from this, there is also the question
of what happens when you move to another OS where the MS IME cannot be used (it
is proprietary, after all).
In fact, Linux offers a number of options, of which a particular one will be mentioned
here. My interest in this article is how to produce written materials using two or more
languages, as in my job here I require at least English and Hangeul (native Korean
script). However, this is also of interest to others because any serious scholar of the
East Asian languages has to contend with the following:
1: Students of Korean really need to know as much Traditional Chinese as they possibly
can, because (a) many words in Korean and Japanese are not only derived from them, they
retain the original (or modified) Chinese characters, and (b) knowing these characters
across at least three languages means that you can understand meaning even if you
do not know how to pronounce the morpheme which corresponds to that character;
2: Students of Japanese need not only to be able to use original and derived Chinese
characters, but also katakana and hiragana;
3: Since the revolution in China in 1949, the government has been trying to "simplify"
the huge number of individual characters as an aid to literacy. This has led to a strange
situation in which non-Chinese users of Chinese characters use the "old" (Traditional)
character set whereas young Chinese have grown up with the new one. However, this does
not make the "old" set invalid. The Imperial Archives in China are full of millions of
documents written in "old" Chinese!
4: An additional point needs to be made regarding the needs of English speakers learning
Korean which is seldom mentioned: since the meanings of so many morphemes in Korean (and
Japanese, among others) are derived quite directly from Old Chinese, being able to
combine the two (if you have a previous exposure to Traditional Chinese, as I have) is
wonderfully helpful. However, although this would be my ultimate aim, my own current
keyboard(s) do not support this - I cannot find an appropriate "overlay" for Chinese IMs
such as the Bopomofo I learned in Taiwan, which is a shame because SCIM itself seems to
have no problem with this. You can actually buy stickers for this, but I think that would
deface my keyboard, irrespective of how practical it might prove to be.
To this it should be added that there are also other languages such as Vietnamese which
have imported a large number of words from Old Chinese and continue to use them to this
day; if you only look in a Vietnamese phrasebook, they are visible even when romanised.
Clearly, then, someone in my position needs to be able to render documents in the
appropriate scripts as and when. In my teaching work, this means that I need to be able
to write in English and Hangeul. The Hangeul script was (we are told) created by King
Sejong back in the fifteenth century, and prior to that, educated people used a form of
Old Chinese modified with Korean diacritics; it was only after about 1891 that the move
away from reliance upon (and knowledge of) Old Chinese characters towards a more
"patriotic" and global usage of Hangeul really began.
But there is also a linguistic point to be made here. Simplification leads to confusion.
According to some sources, there may be more than 60,000 historically distinct characters
in written Traditional Chinese and a process of "simplification" necessarily leads to a
situation in which a reduction in the absolute number of characters, but not of their
meanings, leads to an increase in the number of homophones which can really only be
distinguished on the basis of seeing their written counterparts. Traditional Chinese was
primarily a scholarly and literary language full of subtlety and meaning; the transition
from Old Chinese to Hangeul Korean (which has only about 44 characters) seems to me to
have had the same confusing effect.
Be this as it may, the Konundrum facing me was this: To find a substitute for the IME
usable with ease under Linux/KDE, allowing me ideally to mix Latin script with Hangeul,
hiragana, katakana and Chinese (Old or new, preferably Old). I have yet to succeed one
hundred per cent in this regard, but I have made some progress.
There are a number of options but for my purposes I have settled on a system of software
which seems well established and comes as part of my Mandriva distro: SCIM.
SCIM stands for "Smart Common Input Method" (see http://www.scim-im.org/) and is the
ongoing result of a collaborative effort to bring IME-style ease of multilingual text
input to POSIX-type systems. It works extremely well with KDE and Mandriva, and I have
been using it across several consecutive Mandriva releases over the last couple of years.
It is highly configurable, but a word of warning is needed before jumping into an
install. It will work well with simple text processors (such as Leafpad, which I use for
basic e-mail and other text editing, such as in these articles, although it was
originally intended for coders rather than writers), but to use it with more advanced
word processor software such as AbiWord, KWord and OOo, it is necessary to install
multiple language support at OS installation - in other words, you select a
multiple-language support option and proceed from there.
For this reason, I had to re-install MDV2007.0 on both of my systems (homebuilt custom PC
and Averatec 6700 laptop) so that multilingual input and display were possible.
Previously I never had trouble, because I did this without thinking in preceding
installations, but this time I was "careful" and ended up making more work and
inconvenience for myself! Be warned! Plan well ahead for any anticipated
Why is this necessary? Because, when we install for any particular language preference,
the fonts used by each "preference" will not otherwise be installed. I come from England;
my preferred interface language is UK English (see previous article); but if I install
only UK English for the interface, although my small US-internationalised keyboard has no
problem (because I can have UK English system-wide and an internationalised US converted
keyboard with no problem), I cannot use Hangeul or Chinese or Japanese without difficulty
because the appropriate fonts are not installed by default. If you select UK English (or
any other English for that matter), no other language support will be installed unless
you specify it at the time of installation.
Also, each font actually contains most (if not all) characters for each language
(including diacritics and other symbols, if present), but accessing these can be
difficult and time-consuming without additional software like SCIM to speed things up.
Additionally, software such as AbiWord, KWord and OOo will not be able to detect
non-installed fonts - usually a sure sign of an incorrect (or otherwise only partly
So if we want to use additional scripts (i.e. languages, in this context), we must ensure
that we install the appropriate "language support", and this is one of the earliest
options requested at the time of OS installation; everything else follows automatically
from this - select the languages you wish to use before you install your chosen
OS; write it all down if you have to, just make sure you don't forget or you may have
to go through the whole process all over again! Aaaaaarrrrghhhh!!!!!
SCIM and SKIM can also be installed from the CLI. Enter (as root):
and you will be presented with several options. In the case of someone using Hangeul,
scim-hangul needs to be installed as well as scim-tables with scim itself. Don't forget -
if you do it this way or through a graphical installer such as MCC - to "updatedb" and
then "makewhatis". This is the absolute minimum that you should do after installing the
rpms. Generally speaking, rpm sorts out the dependencies at the time of installation but
you can never be too careful.
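The exact command is not shown above, so the following is only a rough sketch of the sequence described. It assumes that Mandriva's `urpmi` is the installer and that the package names match the names mentioned in the text (`scim`, `scim-tables`, `scim-hangul`, `skim`); both are assumptions, and the helper `install_plan` is hypothetical. It prints the commands instead of running them, so that they can be checked before being entered as root.

```python
# Package names are taken from the article; treat them as assumptions
# and check what your distro's repositories actually call them.
PACKAGES = ["scim", "scim-tables", "scim-hangul", "skim"]


def install_plan(packages):
    """Return the commands to run as root, in order, without running them."""
    return [
        ["urpmi", *packages],  # assumed installer on Mandriva
        ["updatedb"],          # refresh the locate database
        ["makewhatis"],        # rebuild the whatis (man page) database
    ]


if __name__ == "__main__":
    for command in install_plan(PACKAGES):
        print(" ".join(command))
```

Printing the plan first is a deliberate choice: package names differ between releases, and a dry run costs nothing.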
This writer made a decision, after the first few installs of Mandrake/Mandriva, not to
use anything other than rpms for software installation in the future. *.tar files work
fine but . . . I think they don't fit in with everything else, and seem to be a lot
harder to remove, short of wiping the hard drive.
KDE also has a native interface for SCIM called SKIM, a daemon accessible from the
command line, and this should also be selected. Users of GNOME should be aware that
there are software switches allowing better desktop integration for both systems; these
are reached by right-clicking on the SCIM tray icon and left-clicking on "Configure":
SKIM is not added to the main menu upon installation so the Menu Editor should be used to
put a convenient link to the executable. This is especially necessary if you opt to add
"Quit" to the SKIM right-click menu!
Text/word-processing software can also be selected after choosing language preferences -
again, choose with care. Under KDE I use AbiWord, KOffice, OOo and (for things like
writing e-mails) Leafpad. But the latter is apparently of little use for printing written
documents even though a print facility is present, so for simple text processing with a
few other features, I usually use AbiWord. In both of these, right-clicking on the text
space and selecting "SCIM Input Method" from the available options brings up a small
toolbar. This can also be configured for "Stand-alone" or "Embedded" mode. An interesting
note: the default configuration in the SCIM Configuration Panel actually does not allow a
user to quit from this icon; this has to be enabled, after which the "Quit" option is
available when right-clicking.
Right-click on the text area of (in this case) AbiWord and select "Input Method":
Then choose "SCIM Input Method". The SKIM icon will be in the System Tray:
Here, the SKIM icon is bottom right.
After selecting the languages you want to use in the "Configure" dialogue, the SKIM icon
changes to a keyboard, visible at top left. Left-clicking brings up a menu of the
languages, and once your input language is selected, a small floating toolbar appears
which can be used to toggle between languages:
You can do this simply by clicking to toggle between e.g. English:
(Click where it says "En" and "Kr"). Clicking on the wrench icon brings up the
"Configure" menu. Note that once "Hangeul" is selected, a red Hangeul syllable, "Han",
appears in the System Tray to let you know.
This toolbar is useful as you can anchor it on the screen as and where you like, and this
is convenient for flipping between keyboards, as it stays on top and does not interfere
with typed input. There are a number of encoding methods which can be selected but for
most users, flipping between "English/Keyboard" and "Hangul" is very convenient; the only
real complication is the need to select which font you want as appropriate, and this is
where being able to select fonts which are part of the different language support
packages is so invaluable - a font with both nice Hangeul and a nice corresponding Latin
character set makes life so much simpler. Font selection is a function of your word
processor, not of SKIM. Be warned!
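Since the font has to cover every script in the document, it can help to check which scripts a passage actually mixes before choosing one. The sketch below is a rough heuristic, not a proper Unicode Script property lookup: it takes the first word of each letter's Unicode character name (for example "HANGUL" from "HANGUL SYLLABLE HAN"). The function name `scripts_used` and the sample text are made up for illustration.

```python
import unicodedata


def scripts_used(text):
    """Return the set of script labels found among the letters of `text`.

    The label is the first word of the character's Unicode name,
    which is a heuristic and only approximates the real script.
    """
    found = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                found.add(name.split()[0])
    return found


if __name__ == "__main__":
    sample = "Seoul 서울 漢字 ひらがな カタカナ"
    print(sorted(scripts_used(sample)))
```

A font chosen in the word processor then needs glyphs for every label reported, otherwise the missing characters will show up as empty boxes or be taken from a fallback font.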
The Configuration Panel also allows the following important functions to be selected:
* Languages - you can select from a wide range such as Simplified and Traditional Chinese
(guess which one I prefer), Malayalam, several Indian languages, Korean, Japanese,
Amharic, Arabic, Russian, Vietnamese. These can be selected/deselected as required.
* Integration into the KDE or GNOME desktop, selectable from Configure > General SCIM >
Other. "kconfig" and "scim-panel-kde" should be selected for working under KDE; "simple"
or "socket" plus "scim-panel-gtk" for working under GNOME.
* Start automatically when KDE starts (SKIM).
* Left-to-right or right-to-left script.
SCIM/SKIM also has a handy selection table for the East Asian languages, allowing
individual characters to be selected, if desired:
Right-click on the SKIM icon and select "Input Pad". The window which comes up is tabbed
and allows access to Chinese, Korean and Japanese characters (hiragana, katakana) which
can then be clicked to transfer to the text:
Unfortunately, I have not been able to make SCIM/SKIM work as a direct IME with KWord or
OOo so far, but this is easy to circumvent at the moment, in the same way as I edit
e-mails offline. UTF-8 coding remains the same irrespective of the application used,
meaning that a paragraph or passage can be composed and edited in a simpler text
processor such as Leafpad and then copied and pasted as required. This may seem strange
compared to the likes of Word, where composition can be handled directly, but to be blunt
about it, this is really little different from editing anywhere else. Text can simply be
highlighted, copied and pasted across text-processing applications and remains in the
same code, displaying consistently across applications. This seems to me to be more of a
strength than a curse! It does seem that as long as the appropriate language support is
present, there is no problem in this regard. Once again, remember this before
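The point about UTF-8 staying the same across applications can be illustrated with a small sketch. It assumes nothing beyond Python's built-in string handling; the helper `utf8_round_trip` and the sample text are invented for the example. Mixed-script text is encoded to bytes and decoded again, which is in effect what happens when a passage is saved in one program and opened, or pasted, in another that also uses UTF-8.

```python
def utf8_round_trip(text):
    """Encode `text` as UTF-8 and decode it again, returning both forms."""
    data = text.encode("utf-8")
    return data, data.decode("utf-8")


if __name__ == "__main__":
    sample = "English 한글 漢字 ひらがな カタカナ"
    data, restored = utf8_round_trip(sample)
    # The decoded text is identical to what was typed.
    print(restored == sample)

    # Each character keeps one fixed byte sequence, whatever the program.
    for ch in "a한漢ひ":
        print(ch, f"U+{ord(ch):04X}", len(ch.encode("utf-8")), "byte(s)")
```

Latin letters take one byte each and the Korean, Chinese and Japanese characters here take three, but the bytes themselves do not depend on which application wrote them.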
This, then, is the set-up that I have been installing on my Linux box over the last few
Mandrake/Mandriva distros. As always, my willingness to dig a bit has paid dividends in
the form of Linux-based alternatives which all help to create a rich text-editing
environment, the best part being that - for AbiWord, at least - compatibility with Word
2003 (which I also use each day) has so far proven to be complete, with no errors on
documents composed at home, saved in MSWord *.doc format in my Yahoo Documents folders,
then downloaded and printed at work.
If someone offers me a choice between the complexity of Word 2003 and the ease and
simplicity of use of AbiWord, the latter wins hands down, and with assistance from
SCIM/SKIM, it is very helpful indeed. There is no doubt in my mind that the hard work of
the collaborators in these projects has bequeathed people in many countries a convenient
tool for multilingual publication and communication. When will people realise that it's
not what you do, it's the way that you do it, that overloading an app with
features is more of a curse than a blessing?
It's so much better, surely, to be versatile and (ideally) competent with a simple tool
than incompetent with a complex tool. There are too many complex tools in the world of
computing, so what I have here is a wonderful software solution.