1/
I'm looking for some FREE and EASY-TO-USE (no recipes please) software to convert a batch of images into a pdf file. A plus would be to be able to adjust margins for printing. Another plus would be the possibility of reducing the size of the pdf file without sacrificing too much quality. It's for scanned documents, so any additional information on how to create a pdf version of a document via scanning would be appreciated.
2/
I'm also looking for FREE software to convert pdf files into .txt, .odt, .doc and/or .rtf files deleting all of the carriage returns present in the pdf file so as to facilitate editing and modification (I would then reconvert into a pdf document after proofreading and/or making changes).
I've had a look on the net and have found some software for a certain commercial operating system whose name I won't mention, but not a lot for Linux. Some require following recipes, which is of no use to me because the person doing the job is not an IT specialist.
However, the two things that interest me most are:
1/ Batch image conversion to pdf
2/ pdf to txt DELETING CARRIAGE RETURNS (except for changes of paragraph)
I don't agree with you:
Quote:
The thought occurs... then maybe they shouldn't be doing the job, if following directions is out of their scope... O_o
Information Technology is for everybody, including those who are not interested in following recipes, programming and tweaking. If there is no software out there that does this, perhaps it needs to be developed. Programmers need to put themselves in the shoes of users. So often, overzealous programmers develop really wicked software which is unusable from a typical user's point of view, which is a pity really, because their talent goes to waste. However, that's not the point of this thread, I'm getting a little carried away here...
So, to sum up, it's the answers to the two points above that I'm looking for, and the former interests me more. The second point is not as important. One could simply cut and paste to do pdf->txt or use OpenOffice to do txt->pdf. So if it doesn't eliminate the carriage returns, which are present at the end of every single line in a pdf file, it's of no interest to anybody really.
Quote:
I've had a look on the net and have found some software for a certain commercial operating system whose name I won't mention, but not a lot for Linux. Some require following recipes, which is of no use to me because the person doing the job is not an IT specialist.
These sorts of people can often be replaced with a very short shell script.
I think with a little bit of effort you could write a script that would automate this process. I.e., you can write a script to "follow a recipe", and then the "person doing the job" would not have to do anything more than "click a button" that runs your script.
Quote:
There MUST be software that does job 1/ (batch images to pdf).
There are a huge number of command line tools for image file conversion.
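This is the sort of script I was talking about for 1. Please note that it is completely untested.
Code:
#!/bin/bash
# Convert every .jpg in the current directory to a single-page PDF,
# then join the pages into all.pdf. Needs ImageMagick (convert) and pdftk.
outfs=()
for inf in *.jpg ; do
    outf="${inf%jpg}pdf"        # scan1.jpg -> scan1.pdf
    outfs+=("$outf")            # remember each PDF for the join step
    convert "$inf" "$outf"
done
pdftk "${outfs[@]}" cat output all.pdf
rm -f "${outfs[@]}"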
Basically it just converts each jpg file in the current directory into a pdf, and then joins all the pdfs into one multipage pdf.
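One thing to watch with the join step: the shell expands *.jpg in plain alphabetical order, so a file called scan10.jpg sorts before scan2.jpg and the pages can end up shuffled. Here is a minimal sketch of one way around that. It assumes GNU sort (for its -V "version sort" option) and file names without newlines in them; the helper name list_pages_in_order and the scanN.jpg names are made up for illustration.

```shell
#!/bin/bash
# list_pages_in_order: print the given file names one per line in
# "version" order, so that scan2.jpg comes before scan10.jpg.
# Assumes GNU sort (-V) and file names that contain no newlines.
list_pages_in_order() {
    printf '%s\n' "$@" | sort -V
}

# Possible use in the conversion script (hypothetical):
#   mapfile -t pages < <(list_pages_in_order *.jpg)
#   for inf in "${pages[@]}" ; do ... ; done
```

If the scanner already pads its numbers (scan01.jpg, scan02.jpg, ...), the plain *.jpg order is already right and none of this is needed.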
For 2. you've already been told what commands you can use. I'm not exactly sure when you care about the carriage returns, but you can use commands like dos2unix for that.
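A note on dos2unix: it only turns Windows line endings (CR LF) into Unix ones (LF), so it will not join the lines of a paragraph. If the goal is "one line per paragraph", something like the sketch below might be closer. It assumes paragraphs in the extracted text are separated by blank lines, which may not hold for every PDF; the function name unwrap_paragraphs is made up.

```shell
#!/bin/bash
# unwrap_paragraphs: join the lines of each paragraph into one line.
# Assumption: paragraphs are separated by one or more blank lines.
# Reads the named files, or standard input if no file is given.
unwrap_paragraphs() {
    cat -- "$@" | tr -d '\r' |     # drop any DOS carriage returns first
        awk 'BEGIN { RS = ""; ORS = "\n\n" }   # RS empty: one record per paragraph
             { gsub(/[ \t]*\n[ \t]*/, " "); print }'
}
```

For example, if pdftotext (from poppler-utils) is installed, and that is an assumption since the thread does not name the tool, "pdftotext document.pdf - | unwrap_paragraphs > document.txt" would write the unwrapped text to document.txt, which can then be opened in OpenOffice for editing.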
Intriguing.
I've done a bit of C, Pascal, C++, C#, Java, php, etc. but not a lot of bash, hardly any at all.
I've installed pdftk. Looks like a cool tool.
How about "convert" ?!
I assume your code is pseudo-code.
I know what "rm -f" does, it force deletes files, in this case, I imagine it's the file name stored in the outfs variable (which stands for out-file-s...? Plural? An array? No longer needed because it's all in all.pdf now? ... The other singular (outf)? Simple variable? A file name?)... What does "inf" stand for? Do you go through an entire directory and only allow files that end with .jpg into the for loop? I don't understand the "for" loop.
"cat" concatenates files, and I assume "output" is an abbreviation for ... Not sure...
"inf" is a variable name? An array? ... Not sure what the percent sign does...
I guess I'll have to do a few bash script tutorials...
Thanks for the inspiring tip!
Last edited by geeeeky.girl; 12-23-2009 at 06:13 PM.
I don't see anything about inserting an image in a pdf file, and I don't understand how to do the "convert" part in your script. It might be that I don't understand the pseudo-code, but I get the feeling that something essential is missing.
How do you do the "convert" part?
The "convert" command is part of the ImageMagick package. It can convert one image format to another, and it determines the output type from the file extension. Basically each iteration of the loop calls "convert file1.jpg file1.pdf" and adds the new pdf to a list of all the created pdf files. At the end, all the single-page pdf files are joined together into one multipage file, and the single-page files are deleted. The script is more than just pseudo-code. It's untested, but I suspect it has a good chance of working without any modifications.
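To see what the loop does without installing anything or touching any files, here is a small "dry run" sketch that only prints the commands the script would run. The function name dry_run and the file names are made up for the example; the variable names are the ones from the script.

```shell
#!/bin/bash
# dry_run: print the commands the conversion script would run for the
# given file names, without running convert or pdftk.
dry_run() {
    local inf outf outfs=()
    for inf in "$@" ; do            # inf = "input file", one name per pass
        outf="${inf%jpg}pdf"        # % removes a trailing "jpg"; then "pdf" is appended
        outfs+=("$outf")            # outfs = "output files", the list so far
        echo "convert $inf $outf"
    done
    echo "pdftk ${outfs[*]} cat output all.pdf"
}

dry_run page1.jpg page2.jpg
```

In the pdftk line, "cat" and "output" are pdftk's own keywords: "cat" joins the listed pages and "output" names the file to write. In the real script, "for inf in *.jpg" gets its list of names from the shell, which replaces *.jpg with every matching file in the current directory.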
The issue I always run into is converting PDF to doc.
There are some free online conversion sites such as http://www.zamzar.com, but they sometimes make me wait a long time for the converted files, and they do not support encrypted files. The conversion quality is also far from satisfying for PDF files with complicated layouts.
So I always use the desktop application AnyBizSoft PDF to Word Converter. In my long experience of searching and testing, this tool supports encrypted files and batch conversion, and preserves text, layouts, images and hyperlinks well.
You can try them and choose the one you prefer.