What is wrong with pdf files created by Ghostscript?
I installed Ghostscript on a Unix server and use it to create PDF files for Windows clients (as a 'PDF printer').
When I send TrueType fonts as outlines to Ghostscript, the PDF document it creates looks great.
However, the text search tool of Acrobat Reader does not work in these PDF files: it never finds text that is visibly present in the document.
Likewise, the text extract tool of Acrobat Reader does not seem to work either: it extracts only unreadable garbage.
This is an old problem of mine: I had the same problem three years ago with the Windows version of Ghostscript.
What causes the problem and is there a workaround?
When using the text extract tool of Acrobat Reader, I found that:
- the extracted, unreadable text has the same length as the original one,
- each character is "mapped" to another ASCII character like this: if I subtract a certain (constant) number from the ASCII code of the extracted character, I get the ASCII code of the original character.
I think this proves that Ghostscript creates the fonts, but with different ASCII code mappings, and that is why text search does not work.
The above is based on my findings with the Windows version three years ago, but the Unix version produces very similar output, so I think the shift in the ASCII code causes the problem there, too.
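To make the mapping described above concrete, here is a minimal Python sketch, assuming the shift really is one constant for the whole string. The function names `find_offset` and `unshift` and the shift value of 29 are made up for illustration; the actual constant was not recorded, and the garbled string here is simulated rather than extracted from a real PDF.

```python
from typing import Optional


def find_offset(extracted: str, original: str) -> Optional[int]:
    """Return the constant code shift between two strings, or None if there is none."""
    if not original or len(extracted) != len(original):
        return None
    # Per-character difference between extracted code and original code.
    offsets = {ord(e) - ord(o) for e, o in zip(extracted, original)}
    # A single distinct difference means every character is shifted by the same amount.
    return offsets.pop() if len(offsets) == 1 else None


def unshift(extracted: str, offset: int) -> str:
    """Subtract the constant from every character code to recover the text."""
    return "".join(chr(ord(c) - offset) for c in extracted)


# Simulated example: 29 is a hypothetical shift, not a value measured from a real file.
original = "Ghostscript"
garbled = "".join(chr(ord(c) + 29) for c in original)

offset = find_offset(garbled, original)
if offset is not None:
    print(offset, unshift(garbled, offset))
```

Comparing one known word from the document against its extracted form gives the constant, which can then be applied to the rest of the extracted text. This only works around the symptom after extraction; it does not make the search tool work inside the PDF.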
But why?
P.S.
Using the printer fonts instead of sending the TrueType ones produces unacceptable output: incorrect accented characters and overlapping characters.
Sending TrueType fonts as bitmaps produces very ugly documents (rough pixels), and the text extract tool of Acrobat Reader does not work either.
Only OpenOffice.org 1.1 creates correct PDF files, but I need a server solution, for which Ghostscript would be ideal.