Abiword converts PDF to Word easily if imperfectly; surprised Libreoffice won't
Of course I know Libreoffice Writer can convert its documents to PDF; you simply select Export to PDF. But I was hoping to do the opposite: I downloaded a PDF of a doctor's new-patient form before my appointment, and wanted to convert it to a Word document and edit it in Libreoffice. I researched this and found out that Abiword does the conversion easily. It's not perfect--the fonts and other formatting generally aren't there--but I can use it, and the provider and staff can read it and enter it into the computer. I'll settle for it because I don't like writing detailed answers on pre-made forms with a limited amount of space, such as I often face on medical history.
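For anyone who would rather script this than click through the GUI, here is a minimal sketch. It assumes the `abiword` binary is on the PATH with its PDF import plugin installed, and that the chosen export format (here `docx`, which is my assumption and depends on which export plugins your build has) is available. The function names are made up for illustration; the only AbiWord feature used is its `--to=` command-line option.

```python
import shutil
import subprocess
from pathlib import Path


def abiword_command(pdf, target_ext="docx"):
    """Build the argv for a headless AbiWord conversion (hypothetical helper)."""
    # --to=EXT asks AbiWord to convert the input file and exit.
    return ["abiword", f"--to={target_ext}", str(Path(pdf))]


def convert(pdf, target_ext="docx"):
    """Run the conversion; assumes AbiWord can import PDF on this machine."""
    if shutil.which("abiword") is None:
        raise RuntimeError("abiword not found on PATH")
    subprocess.run(abiword_command(pdf, target_ext), check=True)
    # Assumption: the result is written next to the input with the new extension.
    return Path(pdf).with_suffix("." + target_ext)
```

Calling `convert("new-patient-form.pdf")` should leave a `.docx` beside the PDF that Libreoffice can then open; expect the same loss of fonts and layout described above.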
Great, I solved my issue; but could Libreoffice do it? If Abiword can, I guessed the superior Libreoffice probably could too. To my surprise, I found a seemingly "official" statement that no, it can't: https://ask.libreoffice.org/en/quest...o-a-word-file/ (although that post is going on three years old). I imagine the Libreoffice designers simply don't want to incorporate whatever Abiword did, because the conversion doesn't meet their high standards: they would want their conversion to look exactly like the PDF, and Abiword's conversion is crude.
Last edited by newbiesforever; 02-07-2019 at 07:50 AM.
LO can convert a document CREATED in LO between document format and pdf. I have no problem converting a PDF using ABIWORD, then always handling it using LO forever after.
Interesting - I have attempted to replace Libreoffice with Abiword and Gnumeric for a couple of years but every time I try Abiword, it is horrible: the UI is black and flickers and is unusable. This is on both Linux and FreeBSD.
It also depends on what is in the PDF. PDF is a terminal-stage format: your document goes there while waiting to go either to the printer or to the bit bucket. Trying to recover data from a PDF is a fool's errand.
tl;dr: Go get the original that was used to create the PDF and work with that.
I don't particularly like Abiword either, and this is the first useful purpose I've had for it.
Converting a PDF back into 'text' is *NEVER* going to work 100%, unless you just have a basic, single-column text document. Any formatting (dual columns, etc.) is going to throw off whatever you convert.
Personally, if you can't get hold of the source that the PDF was created from, I'd use the pdftotext utility from the command line, and make peace with the fact you're not going to get good results. When I've had to do such things and the PDFs contained images, I'd extract the images from the PDFs first, and then get the text. Copy/paste the text into LibreOffice Writer, shove in the images, and go from there. There just isn't a good way to do this with PDFs.
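A rough sketch of that images-first, text-second routine, in case it helps. It assumes the poppler-utils tools `pdfimages` and `pdftotext` are installed; the function names and the `extracted` output directory are placeholders of mine, not anything standard.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(pdf, out_dir):
    """Return the argv lists: image extraction first, then text extraction."""
    pdf = Path(pdf)
    out_dir = Path(out_dir)
    image_prefix = out_dir / pdf.stem            # pdfimages numbers its output files from this prefix
    text_file = out_dir / (pdf.stem + ".txt")
    return [
        ["pdfimages", "-png", str(pdf), str(image_prefix)],   # write embedded images as PNG
        ["pdftotext", "-layout", str(pdf), str(text_file)],   # try to keep the physical layout
    ]


def extract(pdf, out_dir="extracted"):
    """Run both tools; raises if either is missing or exits with an error."""
    for tool in ("pdfimages", "pdftotext"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found; it normally ships with poppler-utils")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for argv in build_commands(pdf, out_dir):
        subprocess.run(argv, check=True)
```

After `extract("form.pdf")`, the text file and the images sit together in `extracted/`, ready to paste into Writer. The `-layout` flag is optional; on multi-column PDFs it is worth trying with and without it.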
Good advice, but I take one exception: if you are talking about a LO PDF file, LO leaves adequate clues in the metadata to do a (near-)perfect conversion back to LO Writer. If the PDF was created by anything else, it will lack that kind of metadata. Something may be able to read and convert it, but it may not look as you think it should. Always best to have the source.
Quite correct, and great observation. The PDFs I had to work with did NOT have that metadata, so I had to improvise.