I use Vector Linux 6 (based on Slackware 12.1) and I set the encoding to UTF-8 because I use Spanish characters. The problem is that when I try to open files whose names contain accent marks, for instance with xpdf, the names all look garbled. How can I get xpdf to display the names correctly? Thanks.
The usual cause is that substitute glyphs are shown in place of the wanted characters, which would mean that the font xpdf is using does not have the Spanish characters in it. It is possible to have fonts with Spanish characters on your system and yet have programs that do not know enough to use them.
I think many PDFs embed their own fonts in the file, so it may be the fault of the PDF file.
I think most Spanish characters are available in the Latin-1 charset. I also think accented Latin characters are two-byte sequences in UTF-8 and only one byte in Latin-1 (plain ASCII has no accented characters at all). It may be that xpdf doesn't understand UTF-8 encoded characters. xpdf may have an INI setting or preference you can change, or maybe there's a flag you can set somewhere to use UTF-8 encoding. If not, you should probably set your encoding to something xpdf understands.
Sounds to me like xpdf can play nice with UTF-8 but that your file system, or whatever system translates a filename into something that appears on your screen, might not. "é" is a 2-byte character when UTF-8 encoded. It's a one-byte character in Latin-1, so when systems that don't understand UTF-8 look at the filename, they think each é is two other characters.
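To make the two-byte point concrete, here is a minimal Python sketch. The filename is made up for illustration, and it only shows the general mechanism, not what xpdf actually does internally:

```python
# Hypothetical filename, chosen only for illustration.
name = "canción.pdf"

utf8_bytes = name.encode("utf-8")      # what a UTF-8 locale stores on disk
latin1_bytes = name.encode("latin-1")  # what a Latin-1 locale would store

# The accented letter is one byte in Latin-1 but two bytes in UTF-8,
# so the UTF-8 form is one byte longer.
print(len(latin1_bytes), len(utf8_bytes))

# A program that assumes Latin-1 turns each UTF-8 byte into its own
# character, so the single accented letter shows up as two wrong ones.
garbled = utf8_bytes.decode("latin-1")
print(garbled)
```

If the garbled names on your screen have that "two odd characters per accented letter" look, the bytes on disk are probably fine and it is the decoding step that is wrong.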
The sensors program prints out a degree symbol. On a console it prints some garbage character, but in an X terminal it prints a degree sign.
Man pages have had some strange character in them (for years) that does not print on consoles.
XPDF is displaying file names (in the open-file selection) using its own window and font.
For filenames, it is likely choosing one of the default fonts set up by the KDE system controls.
KDE has a control setup program in the main KDE menu.
Hunt down the KDE display properties and the fonts that are set up.
There are different fonts for different uses and various sizes.
Make sure KDE is set up with fonts that have all the characters you need.
If you use GNOME or any other window manager, the same applies.
Still cannot find it?
Write down the settings for all the KDE fonts.
Set them all to some weird, easily recognizable font, different for each one.
See if any of the weird fonts show up in XPDF.
Still cannot find it?
Look at the XPDF font closely and write down its distinguishing characteristics:
serif or sans-serif, how the 'm' is made, the 'g', the 'j',
the 'ae' spacing, and what glyph it displays for the special Spanish characters.
Use the font selector program and go through all the fonts looking for an identical font.
If you find one then disable it.
When XPDF is forced to use a different font then you have found it.
If you have disabled all the candidates and still cannot change the font XPDF uses, then it must be using an internal font. Some programs do that, but it would be very strange for an X Window program. Those cannot be fixed except by getting an updated program.
Get a copy of the XPDF source. Many distributions have them.
Go into the source and find the font used to display filenames.
Fixing it depends upon your programming skills and how badly it is built-in, and you may find something entirely different.
Try a different PDF viewer; there is more than one.
Addendum: Ran strings on the XPDF binary last night, and did NOT see any font names, but did see font function calls.
Looked at the KDE fonts, and the font in the file listing looks like half of them. Was not motivated enough to mess up my own fonts trying to find out which was being used.
There is a KDE tool to look at all Unicode characters. Check the Spanish characters and see if they are one- or two-byte encodings.
I think all of ASCII and the Latin extensions to it encode as one byte. UTF-8 only goes to 2 bytes (and more) for extension pages for the Eastern, African, Asian, Arabic, and other non-Latin languages (and Klingon).
Last edited by selfprogrammed; 07-25-2010 at 04:24 PM.
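Checked the KDE character map last night. Searched for some Spanish characters and found four. I was surprised to see that it gives them a UTF-8 encoding using two bytes, with one having three bytes.
Their code points (the UTF-16 values) are well under 256. That turns out to be expected: UTF-8 uses a single byte only for code points below 128 (plain ASCII), and everything from 128 up takes at least two bytes.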
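Anyone without the KDE character map can check the same thing with a few lines of Python. This is just a sketch; the list of Spanish letters is my own pick, and `utf8_length` is a helper name made up for this example:

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))

# Spanish letters and punctuation picked for illustration.
spanish = "áéíóúñü¿¡"

for ch in spanish:
    # Show the character, its code point, and its UTF-8 bytes in hex.
    print(ch, f"U+{ord(ch):04X}", ch.encode("utf-8").hex())

# Boundaries of the one-, two- and three-byte ranges.
for cp in (0x7F, 0x80, 0x7FF, 0x800):
    print(f"U+{cp:04X}", utf8_length(cp))
```

Every letter in that list has a code point below 256 and still takes two bytes, because the one-byte range stops at U+007F.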
Having written a document on the coding of UTF-8, from what I remember, there can be more than one way to represent a particular character in Unicode (for example, a single precomposed accented letter versus a base letter followed by a combining accent), with one form being considered canonical. Unfortunately, systems differ on which form they use.
It does not matter much; the creator of the filenames decided which form was used, and you are seeing one glyph or two glyphs.
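One case where the same visible character really does have two different byte sequences is Unicode normalization, and it might be why one entry in the character map showed three bytes. That is only a guess on my part. A small Python sketch using the standard `unicodedata` module:

```python
import unicodedata

composed = "\u00f1"     # ñ as a single precomposed code point
decomposed = "n\u0303"  # plain n followed by COMBINING TILDE

# Both should render as the same glyph, but the bytes differ.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")
print(composed_bytes.hex(), len(composed_bytes))
print(decomposed_bytes.hex(), len(decomposed_bytes))

# NFC folds the pair into the precomposed letter; NFD splits it again.
as_nfc = unicodedata.normalize("NFC", decomposed)
as_nfd = unicodedata.normalize("NFD", composed)
print(as_nfc == composed, as_nfd == decomposed)
```

The precomposed form is two bytes and the decomposed form is three, so a filename written in decomposed form can look like an extra stray character in a program that does not combine the accent.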
My character map showed the Spanish character glyphs, so my default fonts in Slackware Linux (kernel 2.6.33) have those characters.
Going through the XPDF docs (/usr/docs/xpdf-*) there is a long changelog file that lists many Unicode-related items. They were very systematic about their use of Unicode and clearly knowledgeable about it. But it is possible that they did not consider Unicode in filenames.
I am a little suspicious of that filename list. It looks as if they might be using some tool (from KDE or GTK) to do the open-file dialog. Many of the tools in KDE display similar boxes for open-file.
This does not help you much, but it points out that it might not be XPDF that is messing up the filenames.
If you post some of the bad filenames then we could play with them too.
But we are unlikely to solve this without examining the XPDF source code.
A bug report to the XPDF support team might be in order, because they would know how they got the filenames displayed. See their /usr/docs for contact info.
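For anyone who wants to look at the bad filenames before posting them, here is a rough Python sketch that checks how the names are actually stored on disk. The function names `classify_name` and `report` are made up for this example; the only library behavior relied on is that `os.listdir` returns raw bytes when given a bytes path:

```python
import os

def classify_name(raw: bytes) -> str:
    """Say whether a raw filename is plain ASCII, valid UTF-8, or neither."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"

def report(directory: bytes = b".") -> None:
    # Passing bytes to os.listdir returns the names as raw bytes,
    # exactly as stored, with no locale decoding applied.
    for raw in sorted(os.listdir(directory)):
        print(classify_name(raw), repr(raw))

if __name__ == "__main__":
    report()
```

If the accented names come back as "utf-8", the file system side is fine and the problem is in whatever draws the file list. If they come back as "not utf-8", the names were probably created under a different encoding.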
I read the bug report. The substitute characters with ~ must be part of the Latin expansion applied to many character sets, which means that the glyphs are in the font, but it is using the wrong encoding to get to them.
It looks like it is trying to decode UTF-8 using an IBM locale code page, giving double characters because the two bytes are not being decoded as UTF-8.
Probably could track down which IBM locale it is using, but it would not be of much use. Changing your locale would not help either; what is missing is the UTF-8 decoding.
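If someone does want to track down the code page, a quick way is to decode the UTF-8 bytes of one letter under a few single-byte encodings and compare with what the screen shows. This is only a sketch: the candidate list is my guess, nothing here comes from the xpdf source, and I am assuming the "characters with ~" in the report are a capital A with tilde:

```python
raw = "ñ".encode("utf-8")  # the two bytes 0xC3 0xB1

# Candidate single-byte encodings; this list is a guess, not taken from xpdf.
candidates = ["latin-1", "cp1252", "cp437", "cp850"]

decoded = {}
for codec in candidates:
    # Each byte is decoded on its own, so one letter becomes two characters.
    decoded[codec] = raw.decode(codec)
    print(codec, decoded[codec])

# Encodings under which the first byte shows up as A-with-tilde (U+00C3).
tilde_codecs = [c for c in candidates if "\u00c3" in decoded[c]]
print(tilde_codecs)
```

Under that assumption, a leading A with tilde fits a Latin-1 style table better than the old IBM PC code page 437, where the same byte is a box-drawing character. Either way the fix is the same: the bytes need to be decoded as UTF-8.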
I don't think I can be of much more use, and it looks like a job for the xpdf dev team. Sorry. Disconnecting from thread.
Last edited by selfprogrammed; 07-31-2010 at 04:14 PM.