LinuxQuestions.org
Old 10-07-2004, 09:53 AM   #1
bran
LQ Newbie
 
Registered: Oct 2004
Posts: 2

Rep: Reputation: 0
searching content of pdf documents


Hi there

I was wondering if anyone knows of software that lets me search for keywords across PDF, OpenOffice, and text documents simultaneously. In other words, something like Google for the content of my local disk.

Thanks in advance,

bran
 
Old 10-08-2004, 12:41 AM   #2
kaise_sose
LQ Newbie
 
Registered: May 2004
Location: Wonderful land of OZ
Distribution: Gentoo
Posts: 23

Rep: Reputation: 15
You can use the grep command:

grep [options] "thing you are looking for" "file(s) to look in"

e.g.

> grep -r "asdf" /home/

would look for the string "asdf" in /home/ and all its subdirectories.

AFAIK this only works on plain text files. grep will read any file, but if the contents aren't text it just sees garbage, so I don't think it will work on PDFs or other binary file types.

With OpenOffice docs the text is still in there somewhere, mixed in with formatting markup (which shouldn't match any normal search anyway), but the file is a zip-compressed archive, so grep will only find it once the contents have been unpacked.
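
A minimal sketch of the recursive search described above, wrapped in a small function. The name search_text is made up for illustration, and the -I flag (skip binary files) is assumed to be available, as it is in GNU and BSD grep.

```shell
# search_text WORD DIR (hypothetical helper name)
# Lists the text files under DIR that contain WORD.
search_text() {
    # -r  recurse into subdirectories
    # -i  ignore case
    # -l  print only the names of matching files
    # -I  treat binary files as non-matching (GNU/BSD grep)
    # --  end of options, so WORD may start with a dash
    grep -r -i -l -I -- "$1" "$2"
}
```

Usage would be something like: search_text "asdf" /home/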
 
Old 10-08-2004, 06:36 AM   #3
maroonbaboon
Senior Member
 
Registered: Aug 2003
Location: Sydney
Distribution: debian
Posts: 1,495

Rep: Reputation: 48
For PDF there is a tool called 'pdftotext' which extracts the text from a PDF file. Then you can use grep as already described, e.g.

pdftotext somefile.pdf - | grep -i someword
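
To run that over a whole directory tree instead of one file, something like the sketch below might do. The function name search_pdfs and the PDF2TXT override variable are invented for this example; the only thing taken from the command above is that pdftotext writes the text to standard output when given "-" as the output file. It assumes file names contain no newlines.

```shell
# search_pdfs WORD DIR (hypothetical helper name)
# Prints the path of every *.pdf under DIR whose extracted text contains WORD.
# PDF2TXT (an assumption of this sketch) names the extractor; it defaults
# to pdftotext and must accept "INPUT -" to write text to standard output.
search_pdfs() {
    word=$1
    dir=$2
    find "$dir" -type f -name '*.pdf' | while IFS= read -r pdf; do
        # Extract the text and test it quietly and case-insensitively.
        if "${PDF2TXT:-pdftotext}" "$pdf" - 2>/dev/null | grep -q -i -- "$word"; then
            printf '%s\n' "$pdf"
        fi
    done
}
```

Usage would be something like: search_pdfs someword ~/documents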

Not sure what you can do with an OpenOffice file.
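
One possible approach for OpenOffice files, offered as a rough sketch rather than a tested recipe: the document is a zip archive whose text lives in a member called content.xml, so it can be piped out with unzip -p, stripped of XML tags, and passed to grep. The function names strip_tags and odf_grep are made up here, and the sed expression is a crude tag stripper that does not decode XML entities.

```shell
# strip_tags (hypothetical helper): replace every XML tag on stdin with a space.
strip_tags() {
    sed -e 's/<[^>]*>/ /g'
}

# odf_grep WORD FILE (hypothetical helper)
# Prints the lines of FILE's content.xml text that contain WORD.
odf_grep() {
    # unzip -p writes the named archive member to standard output.
    unzip -p "$2" content.xml 2>/dev/null | strip_tags | grep -i -- "$1"
}
```

Usage would be something like: odf_grep someword report.sxw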
 
Old 10-08-2004, 07:35 AM   #4
justin_p
Member
 
Registered: Jan 2004
Location: Virginia, USA
Distribution: slack 13; I've used it all :)
Posts: 433

Rep: Reputation: 30
The problem with PDFs is that they are often scanned in, so the pages are just images and there is no text in them to search. I'll have to check out the pdftotext thing.
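
A quick way to tell whether a given PDF is one of those scanned ones might be to count the words that come out of the extractor. This is only a sketch: has_text_layer and the PDF2TXT override are names invented for the example, and a scan that has been run through OCR would still report text.

```shell
# has_text_layer FILE (hypothetical helper name)
# Succeeds if the extractor produces at least one word of text from FILE.
# PDF2TXT (an assumption of this sketch) defaults to pdftotext.
has_text_layer() {
    # wc -w counts whitespace-separated words on standard input.
    words=$("${PDF2TXT:-pdftotext}" "$1" - 2>/dev/null | wc -w)
    [ "$words" -gt 0 ]
}
```

Usage would be something like: has_text_layer somefile.pdf || echo "no searchable text"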
 
  

