LinuxQuestions.org

meshuggah79 07-13-2011 04:13 AM

How to delete old directories based on dates in their dirnames?
 
Hi,

I have tried to find the solution for my problem on this site and other sites but haven't found a good enough answer yet. Maybe some of you can help me out here?

What I need is a script (bash preferably) that can delete directories based on the date in their dirnames.

For example.

I have a bunch of directories named

data-20110623/
data-20110624/
data-20110625/
etc.

I want to run a cron job once a day that deletes directories older than 30 days, based on the date in the dirname.

Anyone wanna share their script skills? If so, it's very much appreciated!

Cheers,
Meshuggah

b0uncer 07-13-2011 07:12 AM

If you just wanted to delete old files, you could simply use find with -mtime option to find files of certain age and with -exec option to execute rm on them, with needed options. If you insist on a script, I'll throw in some simple example which you can start working on...I'll assume the directories are indeed called data-YYYYMMDD as in your post, and they reside in the working directory (which you could easily change, of course).

Code:

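#!/bin/bash

# Current time as seconds since the Epoch
today=$(date "+%s")
# 30 days in seconds (30*24*60*60)
month=2592000
reference=$((today - month))

filearray=(data-*)

for i in "${filearray[@]}"
do
    # ${i:5} drops the leading "data-", leaving YYYYMMDD.
    # "date -j -f" is BSD date syntax (macOS, FreeBSD); with GNU date on
    # Linux the equivalent is: date -d "${i:5}" "+%s"
    num=$(date -j -f "%Y%m%d" "${i:5}" "+%s")
    if [ "$num" -lt "$reference" ]
    then
        echo "$i is older than $reference"
    else
        echo "$i is not older than $reference"
    fi
done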

What this should do (I ran only a few tests after writing this, so be aware!) is
1) take the current date and transform it into Epoch time (number of seconds since UTC midnight Jan 1 1970)
2) calculate a reference value that is 30 days older than today (30*24*60*60 seconds)
3) collect all files (directories) whose name begins with "data-" into an array
4) work through the array
- pick up the numerical date part of the name of each element and transform it into Epoch time
- compare that value against the reference value, checking whether it's smaller (=older) than the reference or not
- do something in either case, in the example simply echo the result.
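
On a Linux box with GNU date, the steps above might be sketched roughly like this. It is only a sketch: it assumes GNU date's -d option and the data-YYYYMMDD naming from the first post, and list_old_dirs and its arguments are names made up for the example. It only prints names, it deletes nothing.

```shell
#!/bin/bash
# Sketch only: assumes GNU date (date -d) and directories named data-YYYYMMDD.
# list_old_dirs and its arguments are made-up names for this example.

# list_old_dirs BASE DAYS [NOW]: print directories under BASE whose name date
# is more than DAYS days before NOW (Epoch seconds, default: current time).
list_old_dirs() {
    local base=$1 days=$2 now=${3:-$(date +%s)}
    local reference=$(( now - days * 24 * 60 * 60 ))
    local dir stamp num

    for dir in "$base"/data-*/; do
        [ -d "$dir" ] || continue               # glob matched nothing
        dir=${dir%/}
        stamp=${dir##*/data-}                   # the YYYYMMDD part of the name
        [[ $stamp =~ ^[0-9]{8}$ ]] || continue  # skip names without an 8-digit date
        num=$(date -d "$stamp" +%s 2>/dev/null) || continue  # skip impossible dates
        if [ "$num" -lt "$reference" ]; then
            echo "$dir"
        fi
    done
}
```

Something like "list_old_dirs /path/to/parent 30" would then print the candidates, and the output can be checked by eye before anything is wired up to rm.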

Remember: this relies on the date formats being what they are thought to be. Make sure that it works the way you want before allowing it to remove anything. In fact, rather than directly delete, I'd personally make it move the to-be-removed directories into a "trash directory" first, and empty that a few days later. This would give you an opportunity to save files in case something went wrong.
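
The trash-directory idea could look roughly like the sketch below. The function name move_to_trash and the trash location are invented for the example; pick whatever suits your setup. Note that mv across filesystems turns into a copy followed by a delete, so the trash directory is best kept on the same filesystem.

```shell
#!/bin/bash
# Sketch only: move_to_trash and the trash path are hypothetical names.

# move_to_trash TRASHDIR DIR...: move each DIR into TRASHDIR instead of deleting it.
move_to_trash() {
    local trash=$1 dir
    shift
    mkdir -p -- "$trash" || return 1
    for dir in "$@"; do
        mv -- "$dir" "$trash"/ || return 1   # stop at the first failure
    done
}
```

For example, "move_to_trash ./trash data-20110101" parks the directory under ./trash, which can be emptied by hand or by a separate job once you are sure nothing is missed.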

This is a really simplistic example, but you can (and should) modify it to your needs. I suspect that other languages, for example Perl, would make this a lot easier--or maybe it's just me, but you insisted on bash. Replace the echo statements with actual working commands (like rm), build a test case (so as not to destroy valuable data in case the script misbehaves), test, and so on.

Hope it helped a little.

meshuggah79 07-13-2011 07:35 AM

Hi,

Thank you for your input.

Yes, unfortunately I need to rely on the date in the directory names, since atime, ctime, mtime etc. aren't relevant. A badly written script restructures all the directories and renames all of them once every day, and that script is out of my control.

So..

The example you gave me is almost what I need, although I want to list (or remove) only the directories that are 30 days old or older. (In this case it will be easy for me to change from listing dirs to removing them.)

Regards,
Meshuggah

b0uncer 07-14-2011 02:04 AM

Quote:

Originally Posted by meshuggah79 (Post 4413716)
The example you gave me is almost what I need, although I want to list (or remove) only the directories that are 30 days old or older. (In this case it will be easy for me to change from listing dirs to removing them.)

Good, then. I'm sure you'll find your way around with that skeleton of a script, if it indeed helps you. If you only need the "older than" criterion, you can just drop the "else" statement and the line following it, and that's it. If you need to be more precise about the selection criterion, change the if statement (less than vs. less than or equal to). If you're more picky about the time difference, for example if you want to count full days and not consider hours or minutes, change the format of the reference value (which here is "today" as given by date) to suit your needs, for example by taking just the day, month and year. These changes are simple to make, but please list the files before removing anything while you are still modifying the script, to be sure what it does.
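
One way to get the full-days comparison might be to skip Epoch seconds entirely and compare the YYYYMMDD part of the name against a YYYYMMDD cutoff as plain numbers. This is a sketch under that assumption; is_before_cutoff is a made-up name, and the cutoff line in the comment assumes GNU date.

```shell
#!/bin/bash
# Sketch only: is_before_cutoff is a hypothetical helper name.
# With GNU date, a 30-day cutoff could be computed as:
#   cutoff=$(date -d '30 days ago' +%Y%m%d)

# is_before_cutoff NAME CUTOFF: succeed if the date in NAME (data-YYYYMMDD)
# is strictly before CUTOFF (YYYYMMDD); return 2 if NAME has no 8-digit date.
is_before_cutoff() {
    local name=${1##*/} cutoff=$2 stamp
    stamp=${name#data-}
    [[ $stamp =~ ^[0-9]{8}$ ]] || return 2
    # 10# forces base 10 so leading zeros are never read as octal
    (( 10#$stamp < 10#$cutoff ))
}
```

Changing < to <= inside the (( )) gives the "30 days or older" variant rather than "older than 30 days".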

There's another reason not to trust "find" blindly: if someone (or something) happened to touch some of the files, for example, their timestamps would change, and they would no longer match the find criteria. Or, if you added files and happened to preserve old timestamps, you'd put them on the remove list too early. So in that light, this *could* be a safer way; in any case, if you are afraid of losing valuable data, make sure you have backups rolling.
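
The timestamp problem is easy to reproduce in a scratch directory. The snippet below is only a demonstration: mtime_candidates is a made-up wrapper name, and -mindepth/-maxdepth are extensions found in GNU and BSD find rather than POSIX options.

```shell
#!/bin/bash
# Sketch only: mtime_candidates is a hypothetical wrapper around find.

# mtime_candidates BASE DAYS: directories directly under BASE whose mtime
# is more than DAYS days old.
mtime_candidates() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2"
}

demo=$(mktemp -d)
mkdir "$demo/data-20110101"
touch -t 201101010000 "$demo/data-20110101"   # backdate the mtime to 1 Jan 2011
mtime_candidates "$demo" 30                   # lists the directory
touch "$demo/data-20110101"                   # a fresh touch resets the mtime
mtime_candidates "$demo" 30                   # now lists nothing
rm -rf "$demo"
```

The directory name never changed, but after the second touch find no longer considers it old, which is exactly why the name-based check can be the safer criterion here.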

See

Code:

man date
for the date formats. Cheers!

sundialsvcs 07-14-2011 07:51 AM

The way that I would do it is:
  1. First, I would use Perl, or another high-level language that you are familiar with.
    • And I would be sure not to do this as root ... Just sayin'.
  2. I would first write a script that identifies the candidate files ... the "find" command can do this ... and writes a list of those filenames to a temporary file.
  3. Now, look at the :eek: :eek: file!!! No, I didn't say, "just skip this step," or, "yeah, I'm sure it has the right stuff in it..." Recognize that you just might be ready to blow off a large foot, so you want to be damn sure it's the right one.
  4. Run another script which reads the temporary file and deletes the entries, then empties or deletes the temporary file.
  5. :cry: "Dammit..." Reach for a backup tape...
This, obviously, is a process that can be automated ... even to a "one-liner" involving the find command with the -exec option. ("There's More Than One Way To Do It ...") But I prefer that processes which are meant to be destructive should be very deliberate, even pedantic.
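
In bash rather than Perl, the deliberate two-step process above might be sketched as below. Some assumptions: the function names and the list file are invented for the example, candidates are picked by the date in the name using a shell glob (not by find, since timestamps are unreliable in this thread's case), the cutoff is a YYYYMMDD number supplied by the caller, and directory names contain no newlines.

```shell
#!/bin/bash
# Sketch only: write_candidates, delete_listed and the list file are hypothetical.

# Step 1: write_candidates BASE CUTOFF LISTFILE
# Record every BASE/data-YYYYMMDD directory dated before CUTOFF (YYYYMMDD).
write_candidates() {
    local base=$1 cutoff=$2 listfile=$3 dir stamp
    : > "$listfile"                             # start with an empty list
    for dir in "$base"/data-*/; do
        [ -d "$dir" ] || continue               # glob matched nothing
        dir=${dir%/}
        stamp=${dir##*/data-}
        [[ $stamp =~ ^[0-9]{8}$ ]] || continue  # ignore names without a date
        if (( 10#$stamp < 10#$cutoff )); then
            printf '%s\n' "$dir" >> "$listfile"
        fi
    done
}

# Step 2 (only after a human has read LISTFILE): delete_listed LISTFILE
# Remove each listed directory, then remove the list itself.
delete_listed() {
    local listfile=$1 dir
    while IFS= read -r dir; do
        if [ -d "$dir" ]; then
            rm -rf -- "$dir"
        fi
    done < "$listfile"
    rm -f -- "$listfile"
}
```

Keeping the two steps as separate commands leaves room for step 3, actually looking at the file, before anything destructive runs.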

chrism01 07-14-2011 08:41 PM

Indeed! That's why I avoid one-liners and usually go for a 'for file in ...' loop approach, so it's easier to read, easier to debug and easier to add extra filtering inside the loop to fine-tune name matching if needed.

++everything sundialsvcs said :)
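
As an illustration of the extra filtering a loop makes room for, here is a rough sketch that accepts only names of the exact form data-YYYYMMDD with a plausible month and day. The function names are made up, and the pattern does not catch every impossible date (it would still accept 20110231, for instance).

```shell
#!/bin/bash
# Sketch only: is_dated_dirname and list_dated_dirs are hypothetical names.

# is_dated_dirname NAME: succeed only for data-YYYYMMDD with month 01-12, day 01-31.
is_dated_dirname() {
    local name=${1##*/}
    [[ $name =~ ^data-[0-9]{4}(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])$ ]]
}

# list_dated_dirs BASE: print the directories under BASE that pass the name filter.
list_dated_dirs() {
    local file
    for file in "$1"/data-*; do
        [ -d "$file" ] || continue              # directories only
        is_dated_dirname "$file" || continue    # extra name filter
        echo "$file"
    done
}
```

Anything like data-20110623.bak or data-old simply falls through the filter and is left alone.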


All times are GMT -5.