LinuxQuestions.org


Adams Seven 12-26-2015 04:50 PM

GREP: only for text files? What am I doing wrong?
 
I thought the command:

grep -a -i -r "Linux is" /media/hd/doc/computer/

would find the text string "Linux is" in all files in /media/hd/doc/computer/ and its subdirectories.

But it doesn't! It finds "Linux is" only in text files. The -a switch isn't doing what I thought it would do. I know that "Linux is" appears in several .doc and .odt files, but grep doesn't report them.

Am I doing something wrong? Using a utility for an unintended purpose? Can anyone suggest a better utility for hunting for text in .doc files?

berndbausch 12-26-2015 05:12 PM

Most likely, the string is not stored in exactly this form in a .doc or .odt file. Perhaps the blank is something other than ASCII 32. You can check that with the od command or a binary editor.
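
As a rough illustration of that check, here is a minimal sketch. It assumes the text is stored as UTF-16LE (two bytes per character), which is one way binary formats can hold text; the sample file is made up for the demonstration and is not a real .doc. The idea is that grep -a still needs the pattern bytes to be contiguous, and od shows why they are not.

```shell
# Assumption: a sample file with "Linux is" encoded as UTF-16LE,
# standing in for text inside a binary document (not a real .doc).
tmp=$(mktemp -d)
printf 'L\000i\000n\000u\000x\000 \000i\000s\000' > "$tmp/sample.bin"

# od -c prints one byte at a time; the NUL bytes show up as \0
od -c "$tmp/sample.bin"

# grep -a reads the file as text, but the pattern bytes are not contiguous
if grep -a -q "Linux is" "$tmp/sample.bin"; then
    echo "match"
else
    echo "no match"
fi

# Dropping the NUL bytes first lets the same pattern match
tr -d '\000' < "$tmp/sample.bin" | grep -c "Linux is"
```

Stripping NUL bytes only helps in this simple case; real documents may be compressed or split the text into pieces, so a converter is usually the safer route.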

The only recommendation I have is OpenOffice or LibreOffice. They might have non-GUI utilities.

syg00 12-26-2015 07:48 PM

Neither .doc nor .odt is a text file, so you can't simply search them as text.
A simple online search should have informed you of this.

For .odt, try unzip before the grep; for .doc, look at catdoc.
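
A minimal sketch of that approach, assuming unzip and catdoc are installed. The function name search_office_docs is made up for this example, and stripping XML tags with sed is a crude assumption about how the text sits in content.xml, so some matches may be missed (for instance when a phrase spans two paragraphs).

```shell
# search_office_docs DIR PATTERN  (hypothetical helper, not a standard tool)
# Prints the path of every .odt or .doc file under DIR whose text
# contains PATTERN, ignoring case.
search_office_docs() {
    dir=$1
    pattern=$2

    # An .odt file is a zip archive; the document text is in content.xml.
    # unzip -p writes that member to stdout, sed removes the XML tags.
    find "$dir" -type f -iname '*.odt' | while IFS= read -r f; do
        if unzip -p "$f" content.xml 2>/dev/null \
            | sed 's/<[^>]*>//g' \
            | grep -i -q -- "$pattern"; then
            printf '%s\n' "$f"
        fi
    done

    # catdoc converts a .doc file to plain text on stdout.
    find "$dir" -type f -iname '*.doc' | while IFS= read -r f; do
        if catdoc "$f" 2>/dev/null | grep -i -q -- "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
}
```

With the directory from the first post, the call would look like: search_office_docs /media/hd/doc/computer/ "Linux is"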

Adams Seven 12-26-2015 08:12 PM

First, FWIW, the command:

grep -a -i -r "Linux" /media/hd/doc/computer/

also comes up empty. Thank you, though, berndbausch, for suggesting that the string might be stored differently; I hadn't thought of that.

This page at serverfault told me I can grep my way through binary files with the -a switch:

https://serverfault.com/questions/32...look-like-text

Did I misunderstand something? (Wouldn't be the first time ...)

I'll experiment with catdoc, syg00; thank you for this recommendation.

Adams Seven 12-26-2015 09:10 PM

It looks like Recoll can do the job:

http://www.lesbonscomptes.com/recoll/


All times are GMT -5.