LinuxQuestions.org

big_mike_jones 11-14-2005 05:37 PM

BASH: how to automate deletion of oldest folder
 
Background:

I work at a University Media Lab running OS X 10.4 that is constantly plagued by cluttered desktops. I suggested we simply not allow students to save anything to the Desktops, but that was quickly shot down. Management discussed it, and now I've been tasked with creating a simple BASH script to do the following:

Requirements:

1) Create a new folder on the Desktop called Desktop_as_of_... ; where "..." is the Date in the format mm-dd-yy
2) Move the contents of the Desktop, except folders from previous cleanups, to the newly created folder
3) Delete a folder when it is 3 weeks old.

Progress:

I've managed to write a script that creates the folder and moves all of the contents of the desktop into it. After some research, I managed to exclude any folder previously created by this script.
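
For reference, requirements 1 and 2 can be sketched in a few lines of shell. This is only a minimal sketch, not the script described above: the function name cleanup_desktop and the example path are made up for illustration, and it assumes every cleanup folder name starts with Desktop_as_of_.

```shell
# Hypothetical helper: move everything on a desktop into a dated folder.
# $1 is the desktop directory, e.g. /Users/someuser/Desktop (assumed path).
cleanup_desktop() {
    local desktop="$1"
    local target item
    target="$desktop/Desktop_as_of_$(date +%m-%d-%y)"

    mkdir -p "$target" || return 1

    for item in "$desktop"/*; do
        [ -e "$item" ] || continue          # empty desktop: the glob matched nothing
        case "${item##*/}" in
            Desktop_as_of_*) continue ;;    # skip today's and earlier cleanup folders
        esac
        mv "$item" "$target"/
    done
}
```

Calling cleanup_desktop with the desktop path does one pass. The * glob does not match hidden files, so entries such as .DS_Store stay where they are.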

Problem: (i.e. Where I need help)

I can't figure out how to automatically delete a folder after a set amount of time. I thought about using the date in the folder's name to sort alphabetically, but I run into a problem when the month goes from 12 to 1 (Dec. to Jan.).
My next idea was to use the date modified attribute to sort. The problem here is that the purpose of keeping the folders for 3 weeks is to allow students time to retrieve important files before they are automatically deleted. If a student transfers or deletes a file, the date modified attribute is updated (at least I believe this to be true; please let me know if I'm mistaken).
I thought I could use a date created attribute, but as of now, I can't find any evidence of its existence.

How you can help

Please give me any suggestions you have. Does date created exist? Is there a way to create my own timestamp? Is there some weird regular expression I could use to filter? Is there some (open source / free) program that does this already?

Thanks to anyone who offers advice, or even just read this whole thing

Andrew Benton 11-14-2005 05:53 PM

find /users/desktop -atime +20 -type d -exec rm -rf '{}' \;

big_mike_jones 11-14-2005 05:59 PM

so -atime = access time right?

Doesn't this suffer from the same problem as date modified? The purpose of having the folders stick around for 3 weeks is to allow ample time for students to retrieve files. Wouldn't someone copying a file from the folder reset the access time?

Thanks for the suggestion.

Dave Kelly 11-14-2005 06:45 PM

This is a long reply. Too long, and for that I apologize.

It is interesting how similar things require solving at about the same time.

I am writing a script to check the update time on mirror sites. The problem you describe is the same one I have.

At each website is a timestamp.txt file with the date everything was last updated. So for your solution, when the folders are created and the archives are placed into them, you should create a timestamp and include it.

CAVEAT: These scripts are not complete and have errors. But they will give you a start toward reading the timestamp.

Code:

#!/bin/sh
## Read the datestamp on each TLDP mirror site and compare it with the datestamp on the main site to see if the mirror is up to date.
## If the dates vary by more than 14 days then generate a report and send the offending site an email reminding them to update.
## The reports should show the sites that are OK, out of date and not responding.

main()
{
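    ## Each starlist line is "url*admin"; IFS=* applies to this read only, so
    ## the reads in gettldp and getmirror still split on whitespace
    while IFS='*' read -r url admin
    do
        check_link "$url" "$admin"
    done < starlist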
}


check_link()
{
  url=$1
  admin_email=$2

  printf "URL is: %s\n" "$url"
  printf "Admin is: %s\n" "$admin_email"

        gettldp
                read linuxstamp < tldpdate
        getmirror
                read mirrorstamp < mirrordate
                printf "tldp %s\nmirror %s\n" "$linuxstamp" "$mirrorstamp"
        diffdate  "$mirrorstamp" "$linuxstamp"
        printf "The <%s> site is %s days out of date.\n" "$url" "$_DIFFDATE"
        }

gettldp()
{
        wget -q http://www.tldp.org/timestamp.txt -O tldptimestamp

        ## read the date record into a variable:

        while read day month num time zone year
        do
          ## _monthnum sets $_MONTHNUM without printing it, so only the
          ## "year month day" line below ends up in tldpdate
          _monthnum "$month"
          printf "%s %s %s\n" "$year" "$_MONTHNUM" "$num"
        done < tldptimestamp > tldpdate
}

getmirror()
{
        wget -q -T 15 -t 5 "$url"/timestamp.txt -O mirrortimestamp

        ## read the date record into a variable:

        while read day month num time zone year
        do
          ## _monthnum sets $_MONTHNUM without printing it, so only the
          ## "year month day" line below ends up in mirrordate
          _monthnum "$month"
          printf "%s %s %s\n" "$year" "$_MONTHNUM" "$num"
        done < mirrortimestamp > mirrordate
}
split_date()
{
  ## Assign defaults when no variable names are given on the command line
    sd_1=${2:-SD_YEAR}
    sd_2=${3:-SD_MONTH}
    sd_3=${4:-SD_DAY}


    oldIFS=$IFS        ## save current value of field separator
    IFS="-/. $TAB$NL"  ## new value allows date to be supplied in other formats
    set -- $1          ## place the date into the positional parameters
    IFS=$oldIFS        ## restore IFS
    [ $# -lt 3 ] && return 1  ## The date must have 3 fields

    ## Remove leading zeroes and assign to variables
    eval "$sd_1=\"${1#0}\" $sd_2=\"${2#0}\" $sd_3=\"${3#0}\""
}

_date2julian()
{
  ## If there's no date on the command line, use today's date
  case $1 in
        "") date_vars  ## From standard-funcs, Chapter 1
            set -- $TODAY
            ;;
  esac

  ## Break the date into year, month, and day
  split_date "$1" d2j_year d2j_month d2j_day || return 2
  ###printf "%s-%s  %s-%s  %s-%s" "$sd_1" "$s2j_year" "$sd_2" "$d2j_month" "$sd_3" "$d2j_day"

  ## Since leap years add a day at the end of February,
  ## calculations are done from 1 March 0000 (a fictional year)
  d2j_tmpmonth=$((12 * $d2j_year + $d2j_month - 3))

  ## If it is not yet March, the year is changed to the previous year
  d2j_tmpyear=$(( $d2j_tmpmonth / 12))

  ## The number of days from 1 March 0000 is calculated
  ## and the number of days from 1 Jan. 4713BC is added
  _DATE2JULIAN=$((
        (734 * $d2j_tmpmonth + 15) / 24 -  2 * $d2j_tmpyear + $d2j_tmpyear/4
        - $d2j_tmpyear/100 + $d2j_tmpyear/400 + $d2j_day + 1721119 ))
        ###printf "%s\n" "$_DATE2JULIAN"
}


date2julian()
{
    _date2julian "$1" && printf "%s\n" "$_DATE2JULIAN"
}

_diffdate()
{
        ###printf "%s++%s  " "$1" "$2"
    case $# in
        ## If there's only one argument, use today's date
        1) _date2julian "$1"
          dd2=$_DATE2JULIAN
          _date2julian
          dd1=$_DATE2JULIAN
          ;;
        2) _date2julian "$1"
          dd1=$_DATE2JULIAN
          _date2julian "$2"
          dd2=$_DATE2JULIAN
          ;;
    esac
    _DIFFDATE=$(( $dd2 - $dd1 ))
}

diffdate()
{
    _diffdate "$@" && printf "%sxxxxxxxxxxxx\n" "$_DIFFDATE"
}

## Set the month number from 1- or 2-digit number, or the name
_monthnum()
{
    case ${1#0} in
        1|[Jj][aA][nN]*) _MONTHNUM=1 ;;
        2|[Ff][Ee][Bb]*) _MONTHNUM=2 ;;
        3|[Mm][Aa][Rr]*) _MONTHNUM=3 ;;
        4|[Aa][Pp][Rr]*) _MONTHNUM=4 ;;
        5|[Mm][Aa][Yy]*) _MONTHNUM=5 ;;
        6|[Jj][Uu][Nn]*) _MONTHNUM=6 ;;
        7|[Jj][Uu][Ll]*) _MONTHNUM=7 ;;
        8|[Aa][Uu][Gg]*) _MONTHNUM=8 ;;
        9|[Ss][Ee][Pp]*) _MONTHNUM=9 ;;
        10|[Oo][Cc][Tt]*) _MONTHNUM=10 ;;
        11|[Nn][Oo][Vv]*) _MONTHNUM=11 ;;
        12|[Dd][Ee][Cc]*) _MONTHNUM=12 ;;
        *) return 5 ;;
    esac
}

monthnum()
{
  _monthnum "$@" && printf "%s\n" "$_MONTHNUM"
}

date_vars()
{
    eval $(date "$@" "+DATE=%Y-%m-%d
                      YEAR=%Y
                      MONTH=%m
                      DAY=%d
                      TIME=%H:%M:%S
                      HOUR=%H
                      MINUTE=%M
                      SECOND=%S
                      datestamp=%Y-%m-%d_%H.%M.%S
                      DayOfWeek=%a
                      DayOfYear=%j
                      DayNum=%w
                      MonthAbbrev=%b")

    ## Remove leading zeroes for use in arithmetic expressions
    _MONTH=${MONTH#0}
    _DAY=${DAY#0}
    _HOUR=${HOUR#0}
    _MINUTE=${MINUTE#0}
    _SECOND=${SECOND#0}


    ## Sometimes the variable, TODAY, is more appropriate in the context of a
    ## particular script, so it is created as a synonym for $DATE
    TODAY=$DATE

    export DATE YEAR MONTH DAY TODAY TIME HOUR MINUTE SECOND
    export datestamp MonthAbbrev DayOfWeek DayNum
}

main "$@"

These scripts were taken from 'Shell Scripting Recipes' by Chris F. A. Johnson.
The scripts are available on the Apress website (http://www.apress.com) in the download section. The author has a web site at http://cfaj.freeshell.org and can be reached by email at cfaj@freeshell.org.
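
Reduced to the desktop-cleanup case, the timestamp idea could look like the sketch below. It is not taken from the book: the helper names and the file name .created_at are assumptions made for illustration. The creation time is written into the folder once, and later runs read it back instead of trusting the filesystem dates.

```shell
# Hypothetical helpers; the stamp file name .created_at is an assumption.

# Record the creation time (seconds since the epoch) inside folder $1.
stamp_folder() {
    date +%s > "$1/.created_at"
}

# Print the age of folder $1 in whole days, based on the recorded stamp.
# $2 optionally overrides "now" (epoch seconds), which makes testing easier.
folder_age_days() {
    local created now
    [ -f "$1/.created_at" ] || return 1
    read -r created < "$1/.created_at" || return 1
    now="${2:-$(date +%s)}"
    echo $(( (now - created) / 86400 ))
}
```

A cleanup run could then delete a folder once folder_age_days reports 21 or more. The stamp lives inside the folder, so anyone with write access could remove or edit it; a folder without a stamp makes folder_age_days fail rather than report an age.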

big_mike_jones 11-14-2005 06:56 PM

Hmmm. That script is a little out of my league. I'm very new to BASH, and relatively inexperienced in programming. That said, Thanks for the suggestion and I hope you get your problem worked out.

Any other suggestions out there?

Also. The spell checker on this site is amazing. Kudos to whomever put it together.

chrism01 11-14-2005 09:37 PM

Actually, -mtime is the last time the file was modified (i.e. its contents changed), which means what it says: if the file is only copied elsewhere, the source file has not been modified, so its mtime does not change.
HTH

big_mike_jones 11-14-2005 09:47 PM

Interesting...

would -mtime be changed if a file was deleted from the folder? This is definitely a possibility as access would not be restricted to the folders.

Obviously, for this script to run automatically there can't be any chance of the wrong folder being deleted.

Again, thanks for the suggestion. I really appreciate people taking time out of their day to help me out.
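
The question can be checked directly. A directory's mtime changes whenever an entry is added to it or removed from it, while copying a file out leaves it alone. The sketch below is an illustration, not anyone's posted script: old_folders is a made-up helper name, and -mindepth and -maxdepth are extensions found in GNU and BSD find rather than part of POSIX.

```shell
# List cleanup folders directly under $1 whose mtime is more than $2 days old.
old_folders() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name 'Desktop_as_of_*' -mtime +"$2"
}

# Demonstration in a scratch directory (nothing outside it is touched).
demo_dir_mtime() {
    local scratch
    scratch="$(mktemp -d)" || return 1
    mkdir "$scratch/Desktop_as_of_10-01-05"
    touch "$scratch/Desktop_as_of_10-01-05/essay.doc"

    # Pretend the folder was last modified on 1 Oct 2005.
    touch -t 200510010000 "$scratch/Desktop_as_of_10-01-05"
    echo "before delete: $(old_folders "$scratch" 20)"

    # Deleting a file inside the folder resets the folder's mtime to now.
    rm "$scratch/Desktop_as_of_10-01-05/essay.doc"
    echo "after delete:  $(old_folders "$scratch" 20)"

    rm -rf "$scratch"
}
```

Running demo_dir_mtime lists the folder before the delete and nothing after it. So with -mtime a student deleting one file postpones the folder's removal; it would not cause an early deletion.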

Andrew Benton 11-15-2005 03:50 AM

Quote:

Originally posted by big_mike_jones
so -atime = access time right?
Read
man find

big_mike_jones 11-15-2005 02:32 PM

From man find:

If no units are specified, this primary evaluates to true if the difference between the file last access time and the time find was started.

So I'm right? Andrew, what events cause -atime to be changed (moving, deletion, ...) ?

I know it is almost automatic to send people to the help page, or in this case the man page. But I promise I have read everything I could find that appeared relevant. Notice I said read, not understood. Anyway, after a good night's sleep, here is my new plan:

1) Use the script I have written so far to create a folder named Desktop_as_of_mm-dd-yyyy, and put all files on the desktop into it.
2) When the script runs, once a week, strip the date from each folder name (using regex, I think. Anyone have a good way to do this?)
3) Convert that date into Unix time
4) Check whether that Unix time is less than the current Unix time minus 3 weeks' worth of seconds
5) If it is, delete the folder


What do the experts here think? Does anyone have a neat trick to easily extract the date info from the folder name? I'm guessing regular expressions are the way to go, but my experience with them is limited to *.*

Thanks for all the help.
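
For step 2 of the plan, a regular expression is not strictly needed: a case pattern plus parameter expansion can pull the date out of the name. This is a minimal sketch, assuming the Desktop_as_of_mm-dd-yyyy naming from step 1; the function names and the MONTH, DAY and YEAR variables are made up for illustration.

```shell
# Print the mm-dd-yyyy part of a cleanup folder name; fail for any other name.
folder_date() {
    case "$1" in
        Desktop_as_of_[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9])
            printf '%s\n' "${1#Desktop_as_of_}" ;;   # strip the fixed prefix
        *)
            return 1 ;;
    esac
}

# Split mm-dd-yyyy into the variables MONTH, DAY and YEAR.
split_mdy() {
    local rest
    MONTH="${1%%-*}"      # everything before the first dash
    rest="${1#*-}"        # everything after the first dash
    DAY="${rest%%-*}"
    YEAR="${rest#*-}"
}
```

For example, folder_date Desktop_as_of_11-14-2005 prints 11-14-2005, and split_mdy 11-14-2005 then sets MONTH=11, DAY=14 and YEAR=2005. Names that do not fit the pattern make folder_date return a non-zero status, which gives the script a place to skip anything it did not create.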

schneidz 11-15-2005 03:08 PM

when i worked in a college computer lab, we had a shared drive that would be accessed by all pc's. but that didnt stop people from dl'ing stuff so we would norton ghost each lab once a week.

the easiest way to do what you want to do is append the day of year to the end of the directory. then you can cron a script that would subtract the day the file was created from the output of `date +%j`.
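
A sketch of that subtraction follows, with two details that are easy to trip over. The helper names are made up, and the sketch assumes a folder is never kept for a year or longer. First, date +%j pads with zeros (008), which shell arithmetic can misread as octal, so the zeros are stripped first. Second, the day of year starts over in January, so a folder created in late December needs the length of its year added back in.

```shell
# Remove leading zeros so that 008 becomes 8 (but a lone 0 stays 0).
strip_zeros() {
    local n="$1"
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n="${n#0}"
    done
    printf '%s\n' "$n"
}

# Print the number of days between two day-of-year values.
# $1 = today's day of year, $2 = day of year taken from the folder name,
# $3 = days in the year the folder was created (365 or 366), default 365.
doy_age() {
    local today created year_len
    today="$(strip_zeros "$1")"
    created="$(strip_zeros "$2")"
    year_len="${3:-365}"
    if [ "$today" -ge "$created" ]; then
        echo $(( today - created ))
    else
        # The year rolled over after the folder was created.
        echo $(( year_len - created + today ))
    fi
}
```

With these, doy_age "$(date +%j)" 350 gives the age of a folder created on day 350, whether or not New Year has passed in between.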

big_mike_jones 11-15-2005 06:34 PM

OK,

Using substring extraction, I have been able to strip out day, month, and year, each to their own variable. Now I'm having trouble getting this into UNIX time. From the man pages, it seems that I should use mktime or possibly strftime. The problem I'm running into is that it seems that both of these commands need a library, called time.h, loaded. I can't find any information on how to do this.

Is there a command I can use to transform month, date, and year into UNIX time easily, in BASH?

Thanks for any help offered.
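
The date command itself can do this conversion, so no C library is involved. The flags differ between systems: GNU date takes the date through -d, while the BSD date that OS X ships takes -j -f with an input format. The sketch below tries to cover both; the function names are made up, and it assumes the mm-dd-yyyy layout from the plan above.

```shell
# Print the Unix time for local midnight of a mm-dd-yyyy date.
mdy_to_epoch() {
    local m d y rest
    m="${1%%-*}"; rest="${1#*-}"; d="${rest%%-*}"; y="${rest#*-}"
    if date --version >/dev/null 2>&1; then
        # GNU date understands --version and parses the date given to -d
        date -d "$y-$m-$d 00:00:00" +%s
    else
        # BSD date (OS X): -j means "do not set the clock", -f gives the input format
        date -j -f '%Y-%m-%d %H:%M:%S' "$y-$m-$d 00:00:00" +%s
    fi
}

# Succeed if the date $1 (mm-dd-yyyy) lies more than $2 days before now.
# $3 optionally overrides "now" in epoch seconds.
older_than_days() {
    local created now
    created="$(mdy_to_epoch "$1")" || return 2
    now="${3:-$(date +%s)}"
    [ $(( now - created )) -gt $(( $2 * 86400 )) ]
}
```

Steps 3 to 5 of the plan then come down to a test such as: if older_than_days 11-14-2005 21; then the folder is a candidate for deletion.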

chrism01 11-16-2005 12:37 AM

You could just use:
-mtime +20 -type d
as part of your find cmd. This would apply the comparison to dirs only ( -type d ), then just
rm -rf <dirname>
which would save you having to parse the dirname at all...

big_mike_jones 11-16-2005 08:08 PM

Hi Everyone,

So after a week's worth of headaches working on this thing, I have given up on BASH. Once I realized that I could use PHP by changing the she-bang line, I knocked this out pretty quickly. If anyone cares, which is doubtful, here is the script as it stands.


#!/usr/local/php5/bin/php
<?php
$user = "...";
$timelimit = 1728000;

// build the folder name with zero-padded month and day (mm-dd-yyyy),
// which is the form the patterns below look for
$foldername = "/Users/$user/Desktop/Desktop_as_of_" . date("m-d-Y");

mkdir($foldername); //make backup directory

// put a list of all files on the Desktop into an array called $files
// NOTE: scandir needs PHP5 to work
$files = scandir("/Users/$user/Desktop");

foreach ($files as $filename) {

// === 0 if the name starts with . (hidden files and the . and .. entries)
$is_hidden = strpos($filename, ".");

// $is_desktop is 1 if $filename is named like a previous backup folder, else 0
$is_desktop = preg_match('/^Desktop_as_of_[0-9]{2}-[0-9]{2}-[0-9]{4}$/', $filename);

// have to use === because strpos returns boolean false for names
// without a ., and false == 0
if ($is_hidden === 0) {
continue;
}
// preg_match always sets $is_desktop (to 0 or 1), so test its value;
// isset() would be true for every name and skip every directory
elseif ($is_desktop && is_dir("/Users/$user/Desktop/$filename")) {
continue;
}

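//move files on desktop into the proper backup folder
//use shell commands, because moving folders in php is hard;
//escapeshellarg keeps names containing quotes, $ or backticks intact
exec("mv " . escapeshellarg("/Users/$user/Desktop/$filename") . " " . escapeshellarg("$foldername/$filename"));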
}

//put list of everything left on the desktop (the backup folders) into array called $folders
$folders = scandir("/Users/$user/Desktop");

foreach ($folders as $f_name) {

$is_hidden = strpos($f_name, ".");
if ($is_hidden === 0) {
continue;
}

//only handle folders created by this script; anything else has no date
//in its name and must never reach the rm below
if (!preg_match('/^Desktop_as_of_([0-9]{2}-[0-9]{2}-[0-9]{4})$/', $f_name, $datestring)) {
continue;
}
//put date info from the folder name into matching variables
list($month, $day, $year) = explode("-", $datestring[1]);

//change date from folder to unix time
$unixtime = mktime(0, 0, 0, $month, $day, $year);

//delete if older than limit set by timelimit variable at beginning
//1728000 seconds = 20 days
//using rm because rmdir doesn't work if the directory is not empty
if ($unixtime < (time() - $timelimit)) {
`rm -rf "/Users/$user/Desktop/$f_name"`;
}
}

?>

Thanks to anyone who took time out of their day to give me a suggestion. If anyone has any comments or ideas for the above script, please send them my way.

bigearsbilly 11-17-2005 03:33 AM

incidentally,
IMHO it's always best to use
YYYYMMDD format for dates as they always
yield correctly to sorting.

bigearsbilly 11-17-2005 03:39 AM

I've done this sort of thing before.

Assuming files are in some form of date spec YYYYMMDD.

Go through the files, then using the name
touch -t YYYYMMDDhhmm the directory
to reset the directory's timestamp.
Then it's a simple matter of using find -mtime +n | xargs rm -r

or if it's just the oldest to rm it's easy to sort -n on the YYYYMMDD
format.
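
With YYYYMMDD in the folder name, the name alone is enough to decide what has expired, because such numbers compare in the same order as the dates they stand for. The sketch below assumes folders named Desktop_as_of_YYYYMMDD (a change from the mm-dd-yyyy names used earlier in the thread); expired_folders is a made-up helper name.

```shell
# Print the cleanup folders under $1 whose YYYYMMDD name is earlier than
# the cutoff date $2 (also written as YYYYMMDD).
expired_folders() {
    local dir name stamp
    for dir in "$1"/Desktop_as_of_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -d "$dir" ] || continue            # no match, or not a directory
        name="${dir##*/}"
        stamp="${name#Desktop_as_of_}"
        # Eight-digit YYYYMMDD values sort numerically in date order.
        if [ "$stamp" -lt "$2" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

The cutoff for "three weeks ago" can come from date: GNU date accepts date -d '21 days ago' +%Y%m%d and BSD date (OS X) accepts date -v-21d +%Y%m%d. Printing the list first and only then passing it to rm -rf gives a chance to check what would be removed.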


All times are GMT -5.