LinuxQuestions.org
Old 06-22-2017, 02:25 PM   #1
Xeratul
Senior Member
 
Registered: Jun 2006
Location: UNIX
Distribution: FreeBSD
Posts: 2,657

Rep: Reputation: 255
Convert a Czech-language PDF with the right characters using pdftotext?


Hello,

I tried to convert the book Romance pro křídlovku:

Code:
pdftotext romance.pdf > romance.txt
However, the Czech characters do not come out right.

Code:
locale -a
C
C.UTF-8
POSIX
Would you know how to do such a complicated operation?
 
Old 06-22-2017, 06:18 PM   #2
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,058

Rep: Reputation: Disabled
Code:
man pdftotext
See the -listenc and -enc options.
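
For illustration, a minimal sketch of how those two options might be used. It assumes the Poppler build of pdftotext and a file called romance.pdf (the name from the first post) in the current directory; convert_pdf is a made-up helper name, not part of pdftotext. Note that the output file is given as the second argument, so no shell redirection is needed.

```shell
#!/bin/sh
# Sketch only: assumes the Poppler build of pdftotext and a PDF named
# romance.pdf (the file name from the first post) in the current directory.

# convert_pdf is a hypothetical helper name, not part of pdftotext.
convert_pdf() {
    enc=$1   # encoding name, exactly as printed by -listenc
    in=$2    # input PDF
    out=$3   # output text file
    # The output file is the second positional argument; no redirection needed.
    pdftotext -enc "$enc" "$in" "$out"
}

if command -v pdftotext >/dev/null 2>&1 && [ -f romance.pdf ]; then
    # Show which encoding names this build accepts.
    pdftotext -listenc
    # UTF-8 can represent every Czech character.
    convert_pdf UTF-8 romance.pdf romance.txt
else
    echo "pdftotext or romance.pdf not found; nothing converted" >&2
fi
```

Any other name printed by -listenc could be passed in place of UTF-8.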
 
Old 06-22-2017, 11:20 PM   #3
Xeratul
Senior Member
 
Registered: Jun 2006
Location: UNIX
Distribution: FreeBSD
Posts: 2,657

Original Poster
Rep: Reputation: 255
Quote:
Originally Posted by Didier Spaier View Post
Code:
man pdftotext
See the -listenc and -enc options.
ISO 8859-2 is not in the list.

I could convert it to DOC with Chromium + Staroffice2, which is able to keep the right encoding.

No way with open-source programs.

Last edited by Xeratul; 06-23-2017 at 12:03 AM.
 
Old 06-22-2017, 11:39 PM   #4
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,058

Rep: Reputation: Disabled
Why not use UTF-8, which is the default?
 
Old 06-23-2017, 12:06 AM   #5
Xeratul
Senior Member
 
Registered: Jun 2006
Location: UNIX
Distribution: FreeBSD
Posts: 2,657

Original Poster
Rep: Reputation: 255
Quote:
Originally Posted by Didier Spaier View Post
Why not use UTF-8, which is the default?
Already up? Had your coffee already?

C.UTF-8 is fine for me, but I would just like to convert the book and view it on a pocket book, that's why.
I added the encoding with dpkg-reconfigure locales. Anyhow, I always use UTF-8 as the default.
DOC worked, but well, TXT could be a cool learning experience.

No luck so far with TXT.
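
One possible route, sketched here as an assumption rather than something tried in this thread: let pdftotext write UTF-8, then re-encode the text file to ISO 8859-2 with iconv. The sample line below stands in for real pdftotext output, and the file names are illustrative only.

```shell
#!/bin/sh
# Sketch only: assumes an iconv command is available.
# The sample line stands in for pdftotext's UTF-8 output; file names are illustrative.

workdir=$(mktemp -d)
printf 'Romance pro křídlovku\n' > "$workdir/romance.txt"

# Re-encode from UTF-8 to ISO 8859-2 (Latin-2), which covers Czech.
iconv -f UTF-8 -t ISO-8859-2 "$workdir/romance.txt" > "$workdir/romance-latin2.txt"

# Round-trip back to UTF-8 to check that no character was lost.
roundtrip=$(iconv -f ISO-8859-2 -t UTF-8 "$workdir/romance-latin2.txt")
echo "$roundtrip"
```

If the input contains a character that ISO 8859-2 cannot represent, iconv stops with an error, which is a useful check in itself.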


Do you know your way around PHP?
 
  

