LinuxQuestions.org


atjurhs 01-15-2016 11:11 AM

is there a way to grep thru *.pdf
 
hey guys, I have a directory with 20+ pdf files and I need to find info within 1 or more of them. the info will contain a specific string, let's say aBcDeFg. I know

[code\] grep -isnr aBcDeFg *.pdf [code]

won't work. is there something that will grep thru a bunch of pdf files?

thanks!

tabby

schneidz 01-15-2016 11:13 AM

maybe strings can help ?

Tonus 01-15-2016 12:11 PM

Could it be ok to convert them first to text ?

TB0ne 01-15-2016 12:15 PM

Quote:

Originally Posted by atjurhs (Post 5478719)
hey guys, I have a directory with 20+ pdf files and I need to find info within 1 or more of them. the info will contain a specific string, let's say aBcDeFg. I know

[code\] grep -isnr aBcDeFg *.pdf [code]

won't work. is there something that will grep thru a bunch of pdf files?

thanks!

tabby

First, the CODE tags are [ CODE] to start and [/ CODE] to stop. :) As for your issue, try pdftotext, which converts a PDF file into plain text that you can then run through grep to find your string. A simple loop:
Code:

for file in /pdf/path/*.pdf; do pdftotext "$file"; done
...will convert them all into text, writing a .txt file next to each PDF. Then swap 'pdftotext' for your grep and point the loop at the .txt files:
Code:

for file in /pdf/path/*.txt; do grep -H aBcDeFg "$file"; done
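
If you would rather not leave a pile of .txt files behind, the two loops can probably be folded into one. This is only a minimal sketch: grep_pdfs is a made-up helper name, it assumes pdftotext from poppler-utils (where "-" as the output name sends the text to stdout), and it assumes GNU grep for the --label option.
Code:

```shell
#!/bin/sh
# Sketch: search every PDF in a directory without writing .txt files.
# grep_pdfs is a hypothetical helper name, not a standard command.
# Assumes pdftotext (poppler-utils) and GNU grep (for --label).
grep_pdfs() {
    pattern=$1
    dir=${2:-.}
    for file in "$dir"/*.pdf; do
        # Skip the unexpanded glob when the directory has no PDFs.
        [ -e "$file" ] || continue
        # "-" sends the extracted text to stdout instead of a .txt file.
        # --label makes grep print the PDF name rather than "(standard input)".
        pdftotext "$file" - 2>/dev/null |
            grep -i -n -H --label="$file" -- "$pattern"
    done
}

# Usage: grep_pdfs aBcDeFg /pdf/path
```

Each match is printed as the PDF name, the line number in the extracted text, and the matching line. The conversion still runs on every search, so if you plan to search the same files many times, keeping the .txt files from the first loop is likely faster.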

atjurhs 01-15-2016 12:50 PM

TB0ne, thanks for your help! I'll give it a try...

and thanks for the code tags, I always forget them, i'll put them on a sticky...

tabby

atjurhs 01-15-2016 01:12 PM

yep, that worked, took a little while to convert them, thanks

tabby


All times are GMT -5.