LinuxQuestions.org


flamingo_l 07-30-2010 05:22 AM

Excluding multiple directories
 
hi,

I want to exclude multiple directories, given as input, from a grep command.

Can anybody help me with this?

The following excludes only one directory, 27thJuly:

[CODE]find . \( -type d -name "27thJuly" -prune \) -o -print| grep -rnH "root" *[\CODE]

lugoteehalt 07-30-2010 06:20 AM

Quote:

Originally Posted by flamingo_l (Post 4049891)
I want to exclude multiple directories, given as input, from a grep command.

Don't quite follow, but 'grep -v something' will exclude every line with the word 'something' in it.

You may string any number of commands together with pipes: command1 | command2 | command3 | command4, etc.
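For instance (a made-up illustration combining a pipe with grep -v, assuming a typical /etc/passwd):
Code:

# print lines of /etc/passwd that mention "root", then drop any that also mention "nologin"
grep "root" /etc/passwd | grep -v "nologin"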

grail 07-30-2010 07:18 AM

Maybe have a look at -regex. Also, you used the wrong slash in your closing code tag ... it should be /code.

theYinYeti 07-30-2010 07:33 AM

The example command you gave would not in fact grep only the files selected by find: because of the trailing *, grep ignores find's output and recursively searches everything in the current directory. I suppose you actually want to grep just the files find prints. The correct syntax for your example command would therefore be:
Code:

find . \( -type d -name "27thJuly" -prune \) -o -print0 | xargs -0 grep -nH "root"
(although neither the parentheses nor -H are strictly required)

For multiple directories, any of these commands should be suitable:
Code:

find . -type d \( -name "dir1" -o -name "dir2" \) -prune -o -print0 | xargs -0 grep -nH "root"
find . -type d -name "dir1" -prune -o -type d -name "dir2" -prune -o -print0 | xargs -0 grep -nH "root"
find . -type d \( -name "dir1" -o -name "dir2" \) -prune -o -exec grep -nH "root" {} +

The last one will perform better if there are a lot of files; in this case, the -H option is necessary. If the "+" is incompatible with your version of find, you can replace it with "\;".
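That is, the portable (but slower, one grep invocation per file) variant would look something like this:
Code:

find . -type d \( -name "dir1" -o -name "dir2" \) -prune -o -exec grep -nH "root" {} \;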
If you want to target very specific directories, not all directories that match the given names, you can play with -regex.
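For instance, with GNU find something along these lines should prune only those exact paths (an untested sketch with made-up sub-directory names):
Code:

find . -type d -regex '\./sub1/dir1\|\./sub2/dir2' -prune -o -type f -print0 | xargs -0 grep -nH "root"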

Yves.

flamingo_l 08-02-2010 06:48 AM

Hi,

I have used the command below, but it threw an error as shown:

Code:

find . -type d \( -name "27thJuly" -o -name "30thJuly" \) -prune -o print0 | xargs -0 grep -nH "abcd"
find: paths must precede expression: print0
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|opt] [path...] [expression]


flamingo_l 08-02-2010 07:18 AM

The command works when I use it without print0, as follows:

Code:

find . -type d \( -name "27thJuly" -o -name "30thJuly" \) -prune -o -print|xargs -0 grep -nH "abcd"
But it is not producing the correct output.
Suppose there are 3 directories in my current dir. I am excluding 2 directories from the search; the other directory has multiple files but only 1 file contains the word "abcd", yet it is showing all the files in that directory, as shown below:

Code:


$  find . -type d \( -name "27thJuly" -o -name "30thJuly" \) -prune -o -print|xargs -0 grep -nH "abcd"
grep: .
./29thJuly
./29thJuly/RP_search.txt
./29thJuly/RP_search_client.sh
./29thJuly/sample1.txt
: No such file or directory

Output should be:

Code:

/29thJuly/sample1.txt

colucix 08-02-2010 07:37 AM

A correct way to exclude directories with -prune might be
Code:

find . \( -wholename ./27thJuly -o -wholename ./30thJuly \) -prune -o -type f -print0 | xargs -0 grep -l "abcd"
The -type f option excludes directory names from the output, and the -print0 option handles file names containing spaces (as previously mentioned).

Alternatively, you can try the -regex option, as suggested by grail above, to match the wanted (and only the wanted) files.
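For instance, if only the .txt files under 29thJuly are of interest, something like this might do with GNU find (a sketch; the pattern is just an example):
Code:

find . -type f -regex '\./29thJuly/.*\.txt' -print0 | xargs -0 grep -nH "abcd"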

flamingo_l 08-03-2010 01:06 AM

Hi Colucix, thank you ... it's working...

From the result of the above grep I want to exclude lines containing #, i.e. comments, from the search result.

The command below is working; is there any other way to do this?

Code:

find . \( -wholename ./27thJuly -o -wholename ./30thJuly \) -prune -o -type f -print0 | xargs -0 grep -l "abcd"|grep -v '#'

colucix 08-03-2010 02:19 AM

Well, if comments are preceded by one or more blank spaces or tabs, you may refine the regexp with something like:
Code:

... | grep -v '^[[:space:]]*#'
This matches (and excludes) any line that starts with zero or more whitespace characters followed by # (note the ^ anchor).
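A quick illustration with made-up input:
Code:

$ printf 'abcd here\n   # abcd in a comment\n' | grep -v '^[[:space:]]*#'
abcd here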

flamingo_l 08-03-2010 02:21 AM

I have written a script which takes the directories to be excluded as input.

I am running the script from a different path, so I am giving the directories to be excluded relative to the path from which I am running the script.

For example --> ../../sample_files/EX1 and ../../sample_files/EX2

When I run the above command it is not excluding the directories :(

Can anybody please help me here?

flamingo_l 08-03-2010 02:28 AM

Thanks Colucix...


colucix 08-03-2010 03:45 AM

Quote:

Originally Posted by flamingo_l (Post 4053515)
I am running the script from a different path, so I am giving the directories to be excluded relative to the path from which I am running the script.

For example --> ../../sample_files/EX1 and ../../sample_files/EX2

The -wholename predicate matches paths relative to the top-level (search) directory specified in the find command. In this case it is better to use absolute paths, e.g.
Code:

find /home/flamingo \( -wholename /home/flamingo/dir1 -o -wholename /home/flamingo/dir2 \) -prune -o -print
This works regardless of the current working directory from which you run the script. Just be sure to use an absolute search path (/home/flamingo in my example) in the find command line.

Moreover, if you want to easily transform the list of directories into their absolute paths, you can go to the directory from which you run the script and do something like this:
Code:

while read line
do
  readlink -f "$line"
done < original_list > new_list

where I assume the input file is called "original_list". The readlink -f command gives the absolute path of any file/directory/link. Hope this helps.
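A quick illustration with hypothetical paths (your actual directories will differ):
Code:

$ cd /home/flamingo/work/scripts      # hypothetical directory the script is run from
$ readlink -f ../../sample_files/EX1
/home/flamingo/sample_files/EX1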

