[SOLVED] Looking for a string in a set of PDF files.

stf92 · 03-09-2014, 08:16 PM

Hi: Is it possible to search for a string in a set of PDF files (I mean, without it being too complicated)?

frankbell · 03-09-2014, 08:39 PM

You can install and use pdfgrep. I just tested it. For Debian, it's in the repos. There doesn't seem to be a SlackBuild, but it's on sourceforge: http://pdfgrep.sourceforge.net/
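For anyone who wants to script this, here is a minimal sketch of calling pdfgrep from Python. The function names are made up for illustration, and it assumes pdfgrep is installed and on the PATH; the long options used (`--page-number`, `--with-filename`, `--ignore-case`) are the ones listed in the pdfgrep man page, but check your installed version.

```python
import subprocess


def build_pdfgrep_command(pattern, pdf_files, ignore_case=False):
    """Return the argument list for a pdfgrep search over several files.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # Print the page number and the file name in front of every match.
    command = ["pdfgrep", "--page-number", "--with-filename"]
    if ignore_case:
        command.append("--ignore-case")
    command.append(pattern)
    command.extend(pdf_files)
    return command


def run_pdfgrep(pattern, pdf_files, ignore_case=False):
    """Run pdfgrep and return its output lines (empty list if nothing matched).

    Assumes the pdfgrep binary is installed; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        build_pdfgrep_command(pattern, pdf_files, ignore_case),
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

Something like `run_pdfgrep("string", ["a.pdf", "b.pdf"])` would then return one entry per matching line, which is roughly what running pdfgrep by hand over the same files prints.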

qweasd · 03-09-2014, 09:22 PM

Also, you could use pdfunite to make them one big PDF.

stf92 · 03-09-2014, 10:34 PM

Quote:

Originally Posted by frankbell

You can install and use pdfgrep. I just tested it. For Debian, it's in the repos. There doesn't seem to be a SlackBuild, but it's on sourceforge: http://pdfgrep.sourceforge.net/

Thank you very much. A pity it can't be used for PDFs that are photographic copies, but neither can the PDF reader.

willysr · 03-09-2014, 11:03 PM

I have submitted a pdfgrep SlackBuild to SBo and am waiting for approval.

metaschima · 03-10-2014, 11:43 AM

Quote:

Originally Posted by stf92

Thank you very much. A pity it can't be used for PDFs that are photographic copies, but neither can the PDF reader.

You would probably need to use an OCR program on it, hope that it works well enough to get words without misspellings (unlikely), and then grep them.

Toutatis · 03-10-2014, 12:41 PM

You can also use pdftotext, which is already in Slackware, and then grep.
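A rough sketch of the pdftotext-then-grep idea in Python, in case it helps. The function names are hypothetical, the search is a plain substring match rather than a regular expression, and it assumes the pdftotext binary is on the PATH. Passing `-` as the output file makes pdftotext write the text to standard output.

```python
import subprocess


def pdf_to_text(pdf_path):
    """Extract the text of one PDF with pdftotext ("-" means write to stdout)."""
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout


def grep_pdfs(needle, pdf_paths, extract=pdf_to_text):
    """Yield (path, line_number, line) for each extracted line containing needle.

    Line numbers refer to the extracted text, not to pages of the PDF.
    The extract argument is only there so another extractor can be swapped in.
    """
    for path in pdf_paths:
        lines = extract(path).splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line:
                yield path, number, line
```

For example, `for hit in grep_pdfs("string", ["a.pdf", "b.pdf"]): print(hit)` would print every matching line. As noted earlier in the thread, this finds nothing in scanned (image-only) PDFs, because pdftotext has no text to extract from them.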