LinuxQuestions.org
Old 12-10-2007, 06:49 AM   #1
markraem
Member
 
Registered: Nov 2003
Posts: 82

Rep: Reputation: 15
find a string in all ASCII files of a system


Hi,

I know that
find . -type f -exec grep -il 'string1' {} \;

will look for string1 in all regular files.
However, it will also look in binary files, which takes too much time.

I want the command to look in ASCII / text files only.

I find a lot of examples like
find . -name "*.txt" -exec grep ...

but this command restricts the search to *.txt files, and there can be other ASCII files that do not have the .txt extension and may also contain the info I am looking for.


I tried to include the file command, as it tells me the type of a file, but I am struggling to integrate it into the find statement.

Can anybody help me?


The idea is to launch the entire query from / so that I can look for parameters in config files without knowing which config files are used. This helps a lot when exploring a Linux distro.

Last edited by markraem; 12-10-2007 at 06:50 AM. Reason: type
 
Old 12-10-2007, 07:38 AM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Arch/XFCE
Posts: 17,802

Rep: Reputation: 728
You can integrate command #1 into command #2 if #1 produces exactly what #2 is looking for. In this case, I think you will need to actually write a small script.

BUT: config files are typically only in certain directories, e.g. /etc, $HOME, and a few others. Thus it seems more efficient to simply restrict the search to those directories.
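
A minimal sketch of what such a small script might look like, assuming file(1) is installed and that its description contains the word "text" for the files of interest. The function name grep_text_only is made up for this example, and the *text* match is a heuristic, not a guarantee.

```shell
#!/bin/sh
# Sketch: grep only the files that file(1) describes as text,
# limited to the directories passed on the command line.
# grep_text_only is a hypothetical helper name, not a standard tool.
grep_text_only() {
    pattern=$1
    shift
    # find hands batches of file names to a small inline sh script.
    find "$@" -type f -exec sh -c '
        pattern=$1
        shift
        for f in "$@"; do
            # file -b prints only the description, e.g. "ASCII text"
            case $(file -b -- "$f") in
                *text*) grep -il -- "$pattern" "$f" ;;
            esac
        done
    ' sh "$pattern" {} +
}

# Example call, restricted to the usual config locations:
# grep_text_only string1 /etc "$HOME"
```

Running file once per file is slow on a large tree, so this fits best when the search is already limited to a few directories.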
 
Old 12-10-2007, 07:49 AM   #3
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 655
I agree that you need a better idea where to look.

You could use the "file" command. It looks only at the beginning of a file. So you would pipe the output of "find" to "file" and then use sed to filter out the added info from each line found. Then you can use xargs to process that list of files. You will probably want to use -print0 in the find command and -0 in the xargs command. Also, because the list will be so long, use one of the xargs options (such as -n) to limit how much is processed at once.

Besides having an idea where to look, there are some directories where you don't want to look, such as /sys, /proc, and /mnt.
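
The pipeline described above could be sketched roughly as follows. This assumes GNU findutils (xargs -0 and -r) and a file(1) that supports --mime-type. The function name grep_text_pipeline and the batch size of 200 are arbitrary choices for the example.

```shell
# Sketch of the find | file | sed | xargs pipeline.
# grep_text_pipeline is a hypothetical name used only for this example.
grep_text_pipeline() {
    pattern=$1
    dir=$2
    # Skip /proc, /sys and /mnt entirely, then print NUL-delimited file names.
    find "$dir" \( -path /proc -o -path /sys -o -path /mnt \) -prune -o \
        -type f -print0 |
        # Classify the files in batches of at most 200 names.
        xargs -0 -r -n 200 file --mime-type |
        # Keep only lines reporting text/*, and strip the added type info.
        sed -n 's|: *text/[^ ]*$||p' |
        # file prints one name per line; convert back to NUL delimiters.
        tr '\n' '\0' |
        xargs -0 -r -n 200 grep -il -- "$pattern"
}

# Example call:
# grep_text_pipeline string1 /
```

Because file prints one name per line, file names that themselves contain a newline are not handled by this sketch. Files that file reports under another MIME type (JSON is often application/json, for instance) are skipped as well.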
 
Old 12-10-2007, 08:15 AM   #4
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 63
A few points:
  1. Linux does not enforce file name extensions as file-type identifiers. Searching for files ending in .txt will find some text files, but it will miss every text file that has a different extension or none at all.
  2. Using find's -exec option works fine, but it is very inefficient for large numbers of files. This is because, when the -exec clause is terminated with \;, the specified command is invoked once per found file. If you have thousands of files, that means thousands of invocations of grep or whatever else you are running.
    Since grep can search through many files at once if they are listed one after the other on the command line, a better approach is to use xargs. xargs takes a command and reads a list of strings from standard input, then executes the command for groups of the listed strings.
    Consider this example:
    Code:
    $ seq 1 10 |xargs -n 3 echo 
    1 2 3
    4 5 6
    7 8 9
    10
    seq just prints the numbers 1 to 10. You can see that xargs is grouping them into threes and appending them as arguments to echo. You can do the same thing with find:
    Code:
    find / -type f |xargs grep -l "string1"
    However, there is a potential problem... If a file name contains a space, xargs will split it into two separate arguments, so grep will treat it as two separate file names (because the space character is an argument delimiter). You can protect against this by asking find to delimit its output with an ASCII NUL using the -print0 option, and you can tell xargs to expect this using the -0 option. It makes the command a little longer but a lot more robust:
    Code:
    find / -type f -print0 |xargs -0 grep -l "string1"
  3. Linux's filesystem is arranged in a way which groups files by type. This provides a mechanism for avoiding large binary files which you don't want to search. For example, all user files should be in /home somewhere. This is where all your documents and photos and so on should reside. So if you want to search only your work, and not through the files which are installed as part of software packages, you can start your search here:
    Code:
    find /home -type f -print0 |xargs -0 grep -l "string1"
    This will save a lot of time. You might also want to avoid anything inside any directory named bin. There are several ways to do this. For example, you can use grep to remove any paths with /bin/ in them, which will stop the search from bothering with any user-installed programs (which are often placed in $HOME/bin). Remember we're using the NUL-delimited output from find. grep takes the option -z to understand this:
    Code:
    find /home -type f -print0 |grep -z -v /bin/ | xargs -0 grep -l "string1"
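
Two other ways to do the same thing, sketched here as a supplement rather than as part of the original answer. The first uses find's -prune so that directories named bin are never entered at all. The second uses the "{} +" form of -exec, which batches file names much like xargs does. The function names are invented for this example.

```shell
# Sketch: NUL-delimited search that never descends into directories named bin.
# search_skip_bin and search_exec_plus are hypothetical names.
search_skip_bin() {
    pattern=$1
    dir=$2
    # -prune stops find at each bin directory instead of listing its
    # contents and filtering them out afterwards.
    find "$dir" -type d -name bin -prune -o -type f -print0 |
        xargs -0 grep -l -- "$pattern"
}

# Same search without xargs: "{} +" makes find pass many file names to
# each grep invocation, and file names with spaces stay intact.
search_exec_plus() {
    pattern=$1
    dir=$2
    find "$dir" -type d -name bin -prune -o -type f \
        -exec grep -l -- "$pattern" {} +
}

# Example calls:
# search_skip_bin string1 /home
# search_exec_plus string1 /home
```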
 
Old 12-12-2007, 04:41 AM   #5
markraem
Member
 
Registered: Nov 2003
Posts: 82

Original Poster
Rep: Reputation: 15
Thank you Matthewg42.

Your solution is indeed quick and is the one I was looking for.
 
Old 12-12-2007, 05:08 AM   #6
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
You may also consider grep's -I option, which is equivalent to the long option --binary-files=without-match. grep decides from a file's content (typically its first bytes) whether the file is binary. If it is, grep treats the file as containing no match and skips it.
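
As a rough sketch, -I can be dropped straight into the find/xargs command from post #4. This assumes a grep that supports -I (GNU grep does). The function name search_no_binary is made up for this example.

```shell
# Sketch: list files containing the pattern, ignoring binary files.
# search_no_binary is a hypothetical name used only for this example.
search_no_binary() {
    pattern=$1
    dir=$2
    # -I makes grep treat binary files as if they contained no match,
    # so -l never lists them.
    find "$dir" -type f -print0 | xargs -0 grep -Il -- "$pattern"
}

# Example call:
# search_no_binary string1 /etc
```

With a grep that supports recursive search, grep -rIl string1 /etc should give a similar result without find.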
 
  


Tags
filetype, find, grep, xargs

