LinuxQuestions.org
Old 05-21-2008, 04:39 AM   #1
hessodreamy
LQ Newbie
 
Registered: May 2008
Posts: 3

Rep: Reputation: 0
Delete files containing matching text


How do I delete files containing particular text, within a certain directory?

I've figured out how to find them using find . -exec grep -n search_term /dev/null {} \;

but how to delete them?
 
Old 05-21-2008, 05:24 AM   #2
bathory
LQ Guru
 
Registered: Jun 2004
Location: Piraeus
Distribution: Slackware
Posts: 11,707

Rep: Reputation: 1583
You can use this:
Code:
find . -exec grep -l search_term {} \;| xargs rm
 
Old 05-21-2008, 05:32 AM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
You're near the solution. Honestly, I'd keep it simple by using a for loop in a script, but if you want a one-liner...
Code:
find . -type f -exec grep -q search_term {} \; -exec echo rm {} \;
The first -exec runs grep in quiet mode, so all it produces is an exit status (0 for a match, 1 for no match). The second -exec is executed only if the preceding command succeeded. In other words, you tell find to look recursively at all files under the current directory and, if they match the pattern, remove them.

I put an echo before the rm command, for testing purposes. Check the results and when you are satisfied, strip out the echo and the files will be actually removed.
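
To try the chained -exec idea without risking real data, here is a minimal sketch that works in a throwaway directory. The scratch directory from mktemp and the names drop.txt and keep.txt are made up for the demonstration; search_term is the placeholder used above.

```shell
# Build a scratch directory so nothing real is deleted (names are hypothetical).
demo=$(mktemp -d)
printf 'this line has search_term in it\n' > "$demo/drop.txt"
printf 'nothing interesting here\n'        > "$demo/keep.txt"

# Dry run: the second -exec only fires for files where grep -q returned 0.
find "$demo" -type f -exec grep -q search_term {} \; -exec echo rm {} \;

# Real run: the same command with the echo stripped out.
find "$demo" -type f -exec grep -q search_term {} \; -exec rm {} \;

# Only keep.txt should still be listed.
ls "$demo"
```

The dry run should print a single rm line for drop.txt, which is the check to make before removing the echo.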
 
Old 05-21-2008, 05:40 AM   #4
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 671
You don't really have a reason for using the find command unless you add a file type test:
find . -type f

If the files have a certain extension, you could use that in a -name argument:

find ./ -type f -name "*.txt" -exec grep -l '<pattern>' '{}' \; | xargs rm

---

If you don't need these tests, then you could simply use:
grep -l '<pattern>' *.txt | xargs rm

If the filenames might have whitespace in them:

grep -l '<pattern>' *.txt | tr '\n' '\0' | xargs -0 rm

You could also use find to pipe the filenames to grep, using xargs again:
find ./ -maxdepth 1 -type f -name "*.txt" -print0 | xargs -0 grep -l '<pattern>' | tr '\n' '\0' | xargs -0 rm

Note the use of "tr '\n' '\0'" to convert the arguments so that they are null separated. This simple trick even allows you to use the output of "ls ." with xargs when filenames might contain whitespace characters. This can be less clumsy than changing IFS.
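
A small sketch of the null-separator trick, assuming an xargs that supports -0 (GNU and BSD versions do). The scratch directory and the file names with spaces are invented for the test; 'pattern' stands in for whatever you are searching for.

```shell
# Scratch directory with hypothetical file names, two of them containing spaces.
demo=$(mktemp -d)
printf 'contains pattern\n' > "$demo/my notes.txt"
printf 'contains pattern\n' > "$demo/plain.txt"
printf 'unrelated\n'        > "$demo/other file.txt"

# grep -l prints one name per line; tr turns each newline into a NUL so that
# xargs -0 passes "my notes.txt" to rm as one argument instead of two words.
grep -l 'pattern' "$demo"/*.txt | tr '\n' '\0' | xargs -0 rm

# Only "other file.txt" should still be listed.
ls "$demo"
```

The trick still breaks on a file name that itself contains a newline, since tr cannot tell that newline apart from the ones grep added.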
 
Old 05-21-2008, 05:52 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Quote:
Originally Posted by jschiwal View Post
If the filenames might have whitespace in them:

grep -l '<pattern>' *.txt | tr '\n' '\0' | xargs -0 rm
The same using the -Z option
Code:
grep -Z -l '<pattern>' *.txt | xargs -0 echo rm
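
As a rough illustration of what -Z buys you, the sketch below assumes GNU grep (where -Z means "end each file name with a NUL byte") and uses a deliberately awkward, made-up file name with a newline in it.

```shell
# Scratch directory; the odd file name is a hypothetical worst case.
demo=$(mktemp -d)
odd="$demo/$(printf 'two\nlines.txt')"
printf 'contains pattern\n' > "$odd"
printf 'unrelated\n'        > "$demo/keep.txt"

# With -Z, grep itself terminates each name with NUL, which is exactly what
# xargs -0 expects, so even the embedded newline survives intact.
grep -Z -l 'pattern' "$demo"/*.txt | xargs -0 rm

# Only keep.txt should still be listed.
ls "$demo"
```

As before, putting echo in front of rm turns this into a dry run.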
 
Old 05-21-2008, 06:01 AM   #6
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 671
Thanks, I hadn't noticed the -Z option to grep. Maybe I should RTFM more often.
 
Old 05-21-2008, 06:29 AM   #7
hessodreamy
LQ Newbie
 
Registered: May 2008
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for the info. A couple of points:
1. There are about 50,000 files in the directory in question. Would this approach of passing a list of filenames to rm handle hundreds or possibly thousands of arguments?
2. How can I get a match if the file contains any one of a selection of words? I tried (using bathory's code)
Code:
find . -exec grep -l "(search_term1)|(search_term2)" {} \;| xargs rm
which I thought might work, but no. Any suggestions?
 
Old 05-21-2008, 06:33 AM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 15,418

Rep: Reputation: 2013
egrep ....
 
Old 05-21-2008, 08:52 AM   #9
hessodreamy
LQ Newbie
 
Registered: May 2008
Posts: 3

Original Poster
Rep: Reputation: 0
Yeah, egrep (or grep -E) interprets the regexp properly. But what about the fact that there might be a couple of thousand files matching? I don't want my server to melt!
Code:
egrep -Z -l  '(search_term1)|(search_term2)' s* | xargs -0 echo rm
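
For anyone wondering why plain grep failed on the alternation, here is a minimal sketch. The words alpha, beta and gamma and the file names are placeholders; the point is that basic regular expressions treat ( ) and | as ordinary characters, while -E gives them their special meaning.

```shell
# Scratch directory with three hypothetical one-line files.
demo=$(mktemp -d)
printf 'alpha\n' > "$demo/one"
printf 'beta\n'  > "$demo/two"
printf 'gamma\n' > "$demo/three"

# Basic grep looks for the literal text "(alpha)|(beta)", which is not there.
if grep -q '(alpha)|(beta)' "$demo/one"; then basic=match; else basic=nomatch; fi

# grep -E treats | as "or", so the files containing alpha or beta are listed.
extended=$(grep -l -E 'alpha|beta' "$demo"/* | wc -l)

echo "basic grep: $basic, files matched with -E: $extended"
```

grep -E is used here instead of egrep because recent GNU grep releases print a deprecation warning for the egrep name; the matching behaviour is the same.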
 
Old 05-21-2008, 09:35 AM   #10
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
A couple of thousand... is not a huge number!
 
Old 05-21-2008, 06:51 PM   #11
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,347

Rep: Reputation: 2365
It's true that if you use a wildcard like *.txt and it expands into too many items (the limit is set in the lib somewhere, I believe), you'll get the dreaded 'Argument list too long'.
The usual trick is to get all the testable files into a dedicated dir and then process them in a loop

Code:
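# A glob in a for loop is expanded inside the shell, so the argument limit does not apply
for file in *
do
    # test the file, then rm it; drop the echo once the list looks right
    [ -f "$file" ] && grep -q search_term "$file" && echo rm "$file"
done
It may be a bit slower, but it'll work.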
If you truly need more speed than this, write a prog in Perl: http://perldoc.perl.org/
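
For the curious, the limit applies to the total size in bytes of the arguments handed to a new program, and getconf can report it. The sketch below prints that value and then runs a loop in a scratch directory; the directory and the file names are invented, and search_term is the placeholder from earlier in the thread.

```shell
# The limit covers the combined size of arguments and environment, in bytes.
echo "ARG_MAX: $(getconf ARG_MAX)"

# Scratch directory with hypothetical file names.
demo=$(mktemp -d)
printf 'has search_term\n' > "$demo/drop me.txt"
printf 'nothing\n'         > "$demo/keep me.txt"

# The glob is expanded inside the shell and each grep or rm receives a single
# name, so the limit never comes into play however many files there are.
for file in "$demo"/*
do
    if [ -f "$file" ] && grep -q search_term "$file"; then
        rm "$file"
    fi
done

# Only "keep me.txt" should still be listed.
ls "$demo"
```

Quoting "$file" is what keeps names with spaces in one piece; the cost is one grep process per file, which is the "bit slower" part.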
 
Old 05-21-2008, 09:05 PM   #12
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 671
Look at the man page for the xargs command. At 50,000 files, the number of arguments may be too large for bash or the shell. Remember that when a command runs, the arguments are in an array argv[]. There are some xargs options to limit the number of arguments that are handled at a time (the -L, -n or -s options).

Also, there is an option of grep that stops searching a file after the first match. This can save considerable time if you are grep'ing a very long file and the match is near the beginning. ( edit: the -l, --files-with-matches will do this )

example:
Code:
find "${dirToSearch}" -maxdepth 1 -type f -print0 | xargs -0 -n 500 egrep -Z -l '(<pattern1>|<pattern2>)' | xargs -0 -n 500 rm

Last edited by jschiwal; 05-21-2008 at 09:13 PM.
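
To get a feel for the batching, here is a rough sketch assuming GNU find, grep and xargs. It first counts how many times xargs starts a command for 50,000 items at -n 500, then runs the whole pipeline on 1,200 made-up files in a scratch directory. The file names and the words pattern1, pattern2 and harmless are placeholders.

```shell
# 50,000 items in batches of 500: xargs runs echo 100 times, one line each.
batches=$(seq 1 50000 | xargs -n 500 echo | wc -l)
echo "batches: $batches"

# Scratch directory with 1200 hypothetical files; every even-numbered one matches.
demo=$(mktemp -d)
i=1
while [ "$i" -le 1200 ]; do
    if [ $((i % 2)) -eq 0 ]; then
        echo 'pattern1' > "$demo/file$i"
    else
        echo 'harmless' > "$demo/file$i"
    fi
    i=$((i + 1))
done

# Same shape as the example above: NUL-separated names, 500 per grep or rm call.
find "$demo" -maxdepth 1 -type f -print0 |
    xargs -0 -n 500 grep -E -Z -l '(pattern1|pattern2)' |
    xargs -0 -n 500 rm

# The 600 non-matching files should remain.
left=$(find "$demo" -type f | wc -l)
echo "files left: $left"
```

Even without -n, xargs splits its input into batches that fit the system limit, so the option is mainly useful when you want smaller, predictable batches.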
 
  




All times are GMT -5.
