LinuxQuestions.org
03-09-2014, 08:16 PM   #1
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,125

Rep: Reputation: 46
Looking for a string in a set of PDF files.


Hi: Is it possible (I mean, without it being too complicated) to look for a string in a set of PDF files?
 
03-09-2014, 08:39 PM   #2
frankbell
Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Mageia, Mint
Posts: 7,731

Rep: Reputation: 1459
You can install and use pdfgrep. I just tested it. For Debian, it's in the repos. There doesn't seem to be a SlackBuild, but it's on sourceforge: http://pdfgrep.sourceforge.net/
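A minimal sketch of what the search might look like once pdfgrep is built and installed. The helper name `search_pdfs` is hypothetical, made up for illustration; `-H` (print file name), `-n` (print page number) and `-i` (ignore case) are pdfgrep options, but check `pdfgrep --help` against the version you install.

```shell
# search_pdfs PATTERN FILE...
# Hypothetical helper: case-insensitive search for PATTERN in the given PDFs,
# printing the file name and page number of each match.
search_pdfs() {
    pattern=$1
    shift
    # -H: always print the file name, -n: page number, -i: ignore case
    pdfgrep -H -n -i -- "$pattern" "$@"
}
```

Typical use from the directory holding the files would be something like `search_pdfs "some string" *.pdf`.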
 
2 members found this post helpful.
03-09-2014, 09:22 PM   #3
qweasd
Member
 
Registered: May 2010
Posts: 439

Rep: Reputation: Disabled
Also, you could use pdfunite to make them one big PDF.
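A rough sketch of the merge step, assuming the pdfunite that ships with poppler. The helper name `unite_pdfs` and the overwrite check are illustrative assumptions; the only thing taken from pdfunite itself is that the source files come first and the destination file comes last.

```shell
# unite_pdfs OUTPUT.pdf INPUT.pdf...
# Hypothetical helper: merge the INPUT files into OUTPUT.pdf.
unite_pdfs() {
    out=$1
    shift
    # Refuse to clobber an existing file (pdfunite would overwrite it).
    if [ -e "$out" ]; then
        echo "unite_pdfs: $out already exists" >&2
        return 1
    fi
    # pdfunite takes the source files first and the destination last.
    pdfunite "$@" "$out"
}
```

For example, `unite_pdfs /tmp/all.pdf *.pdf` would leave one big file that can be searched in the PDF reader. Writing the output outside the current directory keeps it from being matched by `*.pdf` on a later run.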
 
03-09-2014, 10:34 PM   #4
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,125

Original Poster
Rep: Reputation: 46
Quote:
Originally Posted by frankbell
You can install and use pdfgrep. I just tested it. For Debian, it's in the repos. There doesn't seem to be a SlackBuild, but it's on sourceforge: http://pdfgrep.sourceforge.net/
Thank you very much. A pity it can't be used for PDFs that are photographic copies, but neither can the PDF reader.
 
03-09-2014, 11:03 PM   #5
willysr
Senior Member
 
Registered: Jul 2004
Location: Jogja, Indonesia
Distribution: Slackware-Current
Posts: 2,553

Rep: Reputation: 424
I have submitted a pdfgrep SlackBuild to SBo and am waiting for approval.
 
3 members found this post helpful.
03-10-2014, 11:43 AM   #6
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,183

Rep: Reputation: Disabled
Quote:
Originally Posted by stf92
Thank you very much. A pity it can't be used for PDFs that are photographic copies, but neither can the PDF reader.
You would probably need to use an OCR program on it, hope that it works well enough to get words without misspellings (unlikely), and then grep them.
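One possible way to string that together, as a sketch only. No OCR program is named above, so tesseract here is an assumption, as is the helper name `ocr_grep`. pdftoppm (from poppler) renders the pages to images. Passing `stdout` as tesseract's output base should work on reasonably recent versions, but verify it on yours. OCR misreadings will still cause missed matches.

```shell
# ocr_grep PATTERN FILE.pdf
# Hypothetical helper: render each page to PNG, OCR it, and grep the text.
ocr_grep() {
    pattern=$1
    pdf=$2
    tmp=$(mktemp -d) || return 1
    # Render every page at 300 dpi as $tmp/page-<n>.png
    pdftoppm -r 300 -png "$pdf" "$tmp/page"
    for img in "$tmp"/page-*.png; do
        [ -e "$img" ] || continue
        # "stdout" as the output base asks tesseract to print the text;
        # --label (GNU grep) tags each hit with the PDF name and page image.
        tesseract "$img" stdout 2>/dev/null |
            grep -i -H --label="$pdf ($(basename "$img" .png))" -- "$pattern"
    done
    rm -rf "$tmp"
}
```

It would be called as `ocr_grep "some string" scanned.pdf`. Expect it to be slow on long documents, since every page is rendered and recognized on each run.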
 
03-10-2014, 12:41 PM   #7
Toutatis
Member
 
Registered: Feb 2013
Posts: 31

Rep: Reputation: Disabled
You can also use pdftotext, which is already in Slackware, and then grep the output.
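As a sketch of that approach (the helper name `grep_pdfs` is made up for illustration, and `--label` assumes GNU grep, which is what Slackware ships):

```shell
# grep_pdfs PATTERN FILE...
# Hypothetical helper: convert each PDF to text on stdout and grep it.
grep_pdfs() {
    pattern=$1
    shift
    for f in "$@"; do
        # "-" as the output file makes pdftotext write to standard output;
        # --label names the piped input after the PDF so -H can print it.
        pdftotext "$f" - | grep -i -H --label="$f" -- "$pattern"
    done
}
```

Something like `grep_pdfs "some string" *.pdf` then prints each matching line prefixed with the name of the PDF it came from. As with pdfgrep, this only sees PDFs that contain real text, not scanned images.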
 
1 member found this post helpful.
All times are GMT -5.