LinuxQuestions.org
Old 10-02-2020, 07:08 AM   #1
goodiemobster
LQ Newbie
 
Registered: Oct 2020
Posts: 5

Rep: Reputation: Disabled
deleting files in directories but with exclusions of some directories


Hi,
I'm trying to delete files older than 1 year on a server with 1000+ vhosts.

location of the files:
/var/www/vhosts/CLIENT1/httpdocs/files/fotos/
/var/www/vhosts/CLIENT2/httpdocs/files/fotos/
...
/var/www/vhosts/CLIENT1000/httpdocs/files/fotos/

I used to do this with a simple:
find /var/www/vhosts/ -type f -name '*.jpg' -mtime +365 -exec rm {} \;

This worked very well, but now some clients don't want their files removed.

Is there a way to exclude multiple /CLIENTX/httpdocs/files/fotos directories?
It would be great if I could put these in a blacklist.txt or something similar that I can simply adjust before running the removal command.

I'm new here and relatively new to Linux; if I did not post this in the right forum, my apologies!
 
Old 10-02-2020, 07:23 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,692

Rep: Reputation: 7274
https://stackoverflow.com/questions/...n-find-command
 
Old 10-02-2020, 07:33 AM   #3
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,258
Blog Entries: 3

Rep: Reputation: 3713
If there is a pattern to the excluded directories such that you can make a relevant regular expression pattern, then you could use the -regex option:

Code:
find /var/www/vhosts/ -type f -name '*.jpg' -mtime +365 \
        -regextype posix-egrep \
        -not -regex '^/var/www/vhosts/CLIENT[0-9]+/httpdocs/files/fotos/.*' \
        -print
If that works, then append a -delete after the -print option. For the types of regular expression patterns supported, try find -regextype help or see the manual page.

Otherwise you can make a shell script to generate the find command.

Edit: There is an implied logical AND between all the find options. It doesn't have to be written but it is there.
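
To illustrate the implied AND, here is a minimal sketch. The two function names are made up for this example, and both only print, they never delete. They should list the same files, because find joins adjacent tests with -a when no operator is written.

```shell
#!/usr/bin/env bash
# Hypothetical helper names, used only for this illustration.

# Implicit AND: tests written side by side must all be true.
list_old_jpgs_implicit() {
    find "$1" -type f -name '*.jpg' -mtime +365 -print
}

# The same expression with every AND spelled out as -a.
list_old_jpgs_explicit() {
    find "$1" -type f -a -name '*.jpg' -a -mtime +365 -a -print
}
```

Calling either one as, for example, list_old_jpgs_implicit /var/www/vhosts gives a preview of what a later -delete or -exec rm would touch.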

Last edited by Turbocapitalist; 10-02-2020 at 07:35 AM.
 
Old 10-02-2020, 08:06 AM   #4
goodiemobster
LQ Newbie
 
Registered: Oct 2020
Posts: 5

Original Poster
Rep: Reputation: Disabled
Thanks for the answers. Unfortunately, a regex does not apply here because the client names are all different (like company names).

I'm also looking for a clear overview, so I can quickly see which clients are excluded, instead of one long line with all the directories one after another.
Mixing some of the solutions you provided, would this work? (I will definitely test this myself in a testing environment too, but a quick hint is always welcome :-)


find /var/www/vhosts/ \
-not \( -path /var/www/vhosts/companyx -prune \) \
-not \( -path /var/www/vhosts/firmZ -prune \) \
-not \( -path /var/www/vhosts/othercompany12 -prune \) \
-not \( -path /var/www/vhosts/somecompany -prune \) \
-not \( -path /var/www/vhosts/xyz -prune \) \
-not \( -path /var/www/vhosts/123 -prune \) \
-not \( -path /var/www/vhosts/list -prune \) \
-not \( -path /var/www/vhosts/goes -prune \) \
-not \( -path /var/www/vhosts/on -prune \) \
-type f -name '*.jpg' -mtime +365 -exec rm {} \;
 
Old 10-02-2020, 08:52 AM   #5
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,258
Blog Entries: 3

Rep: Reputation: 3713
I'm not sure there's an easy way with find, if you are reading a file containing excluded directories. You could pipe things through some other utilities though and maybe the combined result does what you want.

Code:
find /var/www/vhosts/ -type f -name '*.jpg' -mtime +365 -print0 \
        | grep --invert-match --null-data --extended-regexp --file directories.to.exclude.txt \
        | xargs --null echo rm
The file 'directories.to.exclude.txt' would then contain the directory patterns to exclude. So it would be good to anchor them to the start of the line with a caret ^ on each one and treat them as absolute paths rather than relative paths.

Code:
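# Patterns are matched against the full paths that find prints.
# The trailing slash keeps "companyx" from also matching "companyxyz".
cat << 'EOF' > directories.to.exclude.txt
^/var/www/vhosts/companyx/
^/var/www/vhosts/firmZ/
^/var/www/vhosts/othercompany12/
^/var/www/vhosts/somecompany/
^/var/www/vhosts/xyz/
^/var/www/vhosts/123/
^/var/www/vhosts/list/
^/var/www/vhosts/goes/
^/var/www/vhosts/on/
EOF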
See "man grep" and "man xargs" for shorter options.
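
If the exclusion file should hold plain client names rather than hand-written patterns, the patterns can be generated. The following is only a sketch under assumptions: list_unprotected is a made-up name, the blacklist has one directory name per line, the root path contains no regex or sed special characters that matter, and GNU find, grep, and sed are in use. It escapes regex metacharacters in each name (vhost names often contain dots) and only prints the candidates.

```shell
#!/usr/bin/env bash
# Dry run: prints the files that would be removed; nothing is deleted.
# list_unprotected ROOT BLACKLIST   (hypothetical helper name)
list_unprotected() {
    local root=${1%/} blacklist=$2 patterns
    patterns=$(mktemp) || return 1
    # Drop blank lines, escape regex metacharacters in each client name,
    # then anchor it as ^ROOT/NAME/ so "foo" cannot match "foobar".
    grep -v '^[[:space:]]*$' "$blacklist" \
        | sed -e 's/[][\.|$(){}?+*^]/\\&/g' -e "s|^|^$root/|" -e 's|$|/|' \
        > "$patterns"
    # NUL-separated paths go through grep, then become lines for display.
    find "$root/" -type f -name '*.jpg' -mtime +365 -print0 \
        | grep --null-data --invert-match --extended-regexp --file "$patterns" \
        | tr '\0' '\n'
    rm -f "$patterns"
}
```

For example, list_unprotected /var/www/vhosts blacklist.txt shows the candidates. To really delete, the tr stage would be swapped for xargs --null rm, once the preview looks right.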
 
Old 10-02-2020, 08:58 AM   #6
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,768

Rep: Reputation: 1192
The following comes to my mind - untested.
Code:
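# Directories to skip, as full paths so that -path matches what find sees.
skipdirs=(
/var/www/vhosts/companyx
/var/www/vhosts/firmZ
/var/www/vhosts/othercompany12
/var/www/vhosts/list
/var/www/vhosts/goes
/var/www/vhosts/on
)
oIFS=$IFS
# Split the unquoted $prunelist on newlines only.
IFS="
"
prunelist= or=
for d in "${skipdirs[@]}"
do
  prunelist+="$or
-path
$d
"
  or="-o"
done
[ -n "$or" ] && prunelist="-type
d
(
$prunelist
)
-prune
-o
"
# Dry run: prints the rm commands; drop the echo to really delete.
find /var/www/vhosts/ $prunelist -type f -name '*.jpg' -mtime +365 -atime +7 -exec echo rm {} \;
IFS=$oIFS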
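
The same idea can also be driven from a blacklist.txt, as asked in the first post, using a bash array instead of IFS tricks. This is a rough sketch, not a tested production script: purge_old_jpgs is a made-up name, the file is assumed to hold one client directory name per line, and names are assumed to contain no glob characters (-path treats its argument as a shell pattern).

```shell
#!/usr/bin/env bash
# Dry run by default: prints candidates instead of deleting them.
# purge_old_jpgs ROOT BLACKLIST [ACTION...]   (hypothetical helper name)
purge_old_jpgs() {
    local root=${1%/} blacklist=$2
    shift 2
    local -a action=("$@") prune=()
    local name
    [ "${#action[@]}" -gt 0 ] || action=(-print)
    # One client directory name per line; blank lines and # comments are skipped.
    while IFS= read -r name || [ -n "$name" ]; do
        case $name in ''|'#'*) continue ;; esac
        [ "${#prune[@]}" -gt 0 ] && prune+=(-o)
        prune+=(-path "$root/$name")
    done < "$blacklist"
    if [ "${#prune[@]}" -gt 0 ]; then
        # Do not descend into protected directories; test everything else.
        find "$root/" -type d \( "${prune[@]}" \) -prune -o \
            -type f -name '*.jpg' -mtime +365 "${action[@]}"
    else
        find "$root/" -type f -name '*.jpg' -mtime +365 "${action[@]}"
    fi
}
```

Running purge_old_jpgs /var/www/vhosts blacklist.txt previews the result, and appending -exec rm -- {} + performs the removal. Passing -delete is not a safe choice here: in GNU find, -delete implies -depth, and -prune has no effect when -depth is active, so the protected directories would be searched after all.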
 
Old 10-02-2020, 09:13 AM   #7
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,692

Rep: Reputation: 7274
It looks to me like we need a client-specific setup/config and a script to process the clients one by one.
I would probably use something other than find and shell: Perl, Python, ...
 
Old 10-04-2020, 12:32 AM   #8
X-LFS-2010
Member
 
Registered: Apr 2016
Posts: 510

Rep: Reputation: 58
> I'm trying to delete files older than 1 year on a server with 1000+ vhosts.

First of all, be VERY CAREFUL. PCs are infamous for occasionally having the wrong time: either the clock is off at boot, or some files (that you moved or copied) carry a wrong date (perhaps they were altered while the clock was off), etc.

In general: never do backups by time unless you don't care what is lost.
 
Old 10-04-2020, 10:27 AM   #9
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,327

Rep: Reputation: 1481
It would seem to me very easy to create a file with a list of all the clients who desire that the files be deleted in the format
CLIENT1
CLIENT2
CLIENT3
etc.

Yes, the initial creation of the file may take a while if done manually, but only a few minutes if done with a script which could pull the client name from the directory names, and maintenance should be easy.

Then wrap the find command in a while loop that reads the client names from the file one at a time and, for each client, does the find and delete, similar to:

Code:
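# clientfile holds one client directory name per line.
while IFS= read -r CLIENT
do
   # Skip blank lines so find is never pointed at a wrong path.
   [ -n "$CLIENT" ] || continue
   find "/var/www/vhosts/$CLIENT/httpdocs/files/fotos/" -type f -name '*.jpg' -mtime +365 -exec rm {} \;
done < clientfile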
Something like this would make it easy to add or remove clients whose files get deleted, by a simple edit of the client file.
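
The script that pulls the client names from the directory names could look roughly like this. It is a sketch with assumptions: make_clientfile is a made-up name, the blacklist holds one directory name per line, no directory name contains a newline, and GNU find is available (for -printf).

```shell
#!/usr/bin/env bash
# make_clientfile ROOT BLACKLIST   (hypothetical helper name)
# Prints every directory name directly under ROOT that is not in BLACKLIST.
make_clientfile() {
    local root=${1%/} blacklist=$2
    # %f prints just the last path component, i.e. the client name.
    find "$root/" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' \
        | grep --invert-match --line-regexp --fixed-strings --file "$blacklist" \
        | sort
}
```

For example, make_clientfile /var/www/vhosts blacklist.txt > clientfile regenerates the list before each cleanup run, so new clients are picked up automatically and only the blacklist needs manual edits.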
 
Old 10-08-2020, 01:03 AM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,348

Rep: Reputation: 2749
I was going to suggest basically the same thing: create a whitelist of those that DO need to be deleted and loop through it. That is much simpler than all that fancy negative matching; K.I.S.S.

This would also be a good time to add a command at the end to check the disk space, as the uncleared companies' files are going to eat up the disk ...
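
A disk space check at the end of the cleanup script might be as small as this (report_space is just an illustrative name):

```shell
#!/usr/bin/env bash
# report_space DIR   (hypothetical helper name)
report_space() {
    # Free and used space on the filesystem that holds DIR.
    df -h -- "$1"
    # Total size of DIR itself.
    du -sh -- "$1"
}
```

Called as report_space /var/www/vhosts after the removal, it gives a quick view of whether the protected clients are filling up the disk.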
 
  

