LinuxQuestions.org
Old 01-17-2012, 11:39 AM   #16
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4

Quote:
Originally Posted by padeen View Post
You haven't given find an exec action. -exec some_command {} \+

{} is a placeholder for all the files that find finds. + means pass them all through at once.
Ok, I implemented it in a similar way to my old script. What find does in my script is this:

I use find to get all files within the date range, then I delete the first row from the returned string so that I keep one file for that range. Can this be done within the find command itself, with some option like "skip first row"? After that I iterate through all the files stored in a variable and delete them one by one.

Because I need to save one file per date range, I don't think I can use find's -exec action. Please take a look at the code I have now and tell me whether I can use it better.

Unfortunately I also have a bug which I cannot find. My previous script deleted properly, so that I kept files according to the pattern I wanted. My new implementation with find strangely keeps 406 files instead of the 20-30 that my other one did. Please help me spot the bugs:

Code:
smart_rm ()
{
	#If no directory has been specified, exit
	if [ -z "$1" ]; then
		echo "$ISO_DATETIME [ERROR]: You must specify a directory to clean."
		return 1
	fi

	local TRGT_DIR=$1

	#Target must be a directory
	if [ ! -d "$TRGT_DIR" ]; then
		echo "$ISO_DATETIME [ERROR]: The target must exist and be a directory."
		return 1
	fi

	#Make sure that the path ends with /
	if [ "${TRGT_DIR#${TRGT_DIR%?}}" != "/" ]; then
		TRGT_DIR="${TRGT_DIR}/"
	fi

	#Files to delete
	local FILES_TO_DELETE
	#Set a minimum age for files to be deleted
	local DAY_RM_THRESHOLD=2
	local DAY_SPAN=1
	local DAY_RM_LIMIT=
	local FILES=
	local FILE_COUNT=

	#Loop as long as there are older files
	while [ $(find "$TRGT_DIR"* -daystart -mtime +$DAY_RM_THRESHOLD | wc -l) -gt 0 ]
	do
		if [ $DAY_RM_THRESHOLD -le 7 ]; then
			FILES=$(find "$TRGT_DIR"* -daystart -mtime $DAY_RM_THRESHOLD)
		else
			DAY_RM_LIMIT=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
			FILES=$(find "$TRGT_DIR"* -daystart \( -mtime +$DAY_RM_THRESHOLD -a -mtime -$DAY_RM_LIMIT \) )
		fi

		#Select files to delete
		FILE_COUNT=$(echo "$FILES" | wc -l )

		#Add all except the first to the delete array
		for FILE in $(echo "$FILES" | sed -n 2,"$FILE_COUNT"p)
		do
			FILES_TO_DELETE=(${FILES_TO_DELETE[@]} "$FILE")
		done

		#Increase the day span accordingly
		if [ $DAY_RM_THRESHOLD -lt 7 ]; then
			DAY_SPAN=1
			echo "INCREASE DAY"
		elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
			DAY_SPAN=7
			echo "INCREASE WEEK"
		elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt 365 ]; then
			DAY_SPAN=30
			echo "INCREASE MONTH"
		else
			DAY_SPAN=365
			echo "INCREASE YEAR"
		fi

		#Increase the age threshold
		DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
	done

	#Show result
	#for FILE in ${FILES_TO_DELETE[@]}
	#do
	#	echo $FILE
	#done

	echo $(ls "$TRGT_DIR" | wc -l)
	echo ${#FILES_TO_DELETE[@]}

	#Delete the selected files
	for FILE in ${FILES_TO_DELETE[@]}
	do
		rm -R $FILE
	done

	echo $(ls "$TRGT_DIR" | wc -l)
}

Last edited by gunnarflax; 01-17-2012 at 11:40 AM. Reason: put the command find in bold
 
Old 01-17-2012, 07:40 PM   #17
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
Now I have a fully functioning version with the "ls" method! It is as follows:
Code:
        #Files to delete
	local FILES_TO_DELETE
	#Set a minimum age for files to be deleted
	local DAY_RM_THRESHOLD=2
	local DAY_SPAN=1
	#Create the controller for found files
	local FOUND_ONE=1

	#Loop as long as there are files to examine
	for FILE in $(ls -t $TRGT_DIR)
	do
		#Get the file's modification date
		FILE="$TRGT_DIR$FILE"
		MTIME=$(date -d "$(stat -c %y $FILE)" +%s)

		#Increase the day span accordingly
		if [ $DAY_RM_THRESHOLD -lt 7 ]; then
			DAY_SPAN=1
		elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
			DAY_SPAN=7
		elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt $((28+30*11)) ]; then
			DAY_SPAN=30
		else
			DAY_SPAN=365
		fi

		#If the file's modification time is earlier than our date range we push it back one $DAY_SPAN
		if [ $MTIME -lt $(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s) ]; then
			DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
			FOUND_ONE=1
		fi

		#Get date range
		DATE_END=$(date -d "$DAY_RM_THRESHOLD days ago" +%s)
		DATE_START=$(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s)

		#Have we found one to keep for this day?
		if [ $FOUND_ONE -eq 1 ] && [ $MTIME -ge $DATE_START ] && [ $MTIME -lt $DATE_END ]; then
			FOUND_ONE=0
		else
			FILES_TO_DELETE=(${FILES_TO_DELETE[@]} "$FILE")
		fi	
	done

	#Delete the selected files
	for FILE in ${FILES_TO_DELETE[@]}
	do
		rm -R $FILE
	done
I would still very much like to know why my find version doesn't select all the files: it removes about 400 files too few. The ls version is basically the same as the find version, so I don't understand what causes this strange behaviour. Can I get some feedback on these two scripts, and on why find behaves so differently?

find:
Code:
        #Files to delete
	local FILES_TO_DELETE
	#Set a minimum age for files to be deleted
	local DAY_RM_THRESHOLD=2
	local DAY_SPAN=1
	local FILES=
	local LINE_COUNT=

	#Loop as long as there are older files
	while [ $(find "$TRGT_DIR"* -daystart -mtime +$DAY_RM_THRESHOLD | wc -l) -gt 0 ]
	do
		if [ $DAY_RM_THRESHOLD -le 7 ]; then
			FILES=$(ls -t $(find "$TRGT_DIR"* -daystart -mtime $DAY_RM_THRESHOLD))
		else
			FILES=$(ls -t $(find "$TRGT_DIR"* -daystart \( -mtime +$(($DAY_RM_THRESHOLD)) -a -mtime -$(($DAY_RM_THRESHOLD+$DAY_SPAN)) \) ))
		fi

		#Select files to delete
		LINE_COUNT=$(echo "$FILES" | wc -l )
		#echo $LINE_COUNT

		#Add all files except the first to the delete array
		for FILE in $(echo "$FILES" | tail -n $(($LINE_COUNT)))
		do
			#echo $FILE
			FILES_TO_DELETE=(${FILES_TO_DELETE[@]} "$FILE")
		done

		#Increase the day span accordingly
		if [ $DAY_RM_THRESHOLD -lt 7 ]; then
			DAY_SPAN=1
			echo "INCREASE DAY"
		elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
			DAY_SPAN=7
			echo "INCREASE WEEK"
		elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt $((28+30*11)) ]; then
			DAY_SPAN=30
			echo "INCREASE MONTH"
		else
			DAY_SPAN=365
			echo "INCREASE YEAR"
		fi

		#echo $DAY_SPAN
		#echo $DAY_RM_THRESHOLD

		#Increase the age threshold
		DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
	done

	#Delete the selected files
	for FILE in ${FILES_TO_DELETE[@]}
	do
		rm -R $FILE
	done
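
One thing that may contribute to the difference (a guess, not a confirmed diagnosis of the script above): find's -mtime counts whole 24-hour periods, and +N / -M are strict comparisons, so `-mtime +3 -mtime -5` matches only files whose age is exactly 4 days. Files aged 3 or 5 days fall outside the range. The sketch below assumes GNU find and GNU touch; the directory and file names are made up for the demonstration, and -daystart is left out to keep the arithmetic simple.

```shell
demo_dir=$(mktemp -d)
now=$(date +%s)

# Create three files that are 3, 4 and 5 days (plus one hour) old.
for days in 3 4 5; do
	touch -d "@$((now - days * 86400 - 3600))" "$demo_dir/age$days"
done

# Strictly between 3 and 5 whole days: only the 4-day-old file matches.
between=$(find "$demo_dir" -type f -mtime +3 -mtime -5)
# Exactly 3 whole days: only the 3-day-old file matches.
exact=$(find "$demo_dir" -type f -mtime 3)

echo "between: $(basename "$between")"
echo "exact:   $(basename "$exact")"
rm -rf "$demo_dir"
```

If the ranges in a loop are built as +THRESHOLD and -(THRESHOLD+SPAN), the boundary days themselves are never matched by any iteration, so files of those ages would never be considered for deletion.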
 
Old 01-17-2012, 09:13 PM   #18
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 86
find operates recursively unless you tell it not to; ls operates recursively only if you tell it to.

So in most situations you wouldn't put the "star"/"asterisk" pattern-matching character after the directory name with find.

Typically find would be used:

Code:
find "$TRGT_DIR"
not

Code:
find "$TRGT_DIR"*
In some situations, using the asterisk might effectively cause duplicate file names in the list of files.
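
A small sketch of the recursion point, assuming GNU find (for -mindepth and -maxdepth); the directory layout is invented for the demonstration. With the glob, the shell hands find every top-level entry and find then descends into each directory among them. Starting at the directory itself and limiting the depth lists only the direct entries.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.txt" "$demo_dir/sub/b.txt"

# The shell expands the glob first; find then recurses into "sub" as well.
glob_count=$(find "$demo_dir"/* | wc -l)

# Start at the directory and stay at depth 1: only a.txt and sub are listed.
flat_count=$(find "$demo_dir" -mindepth 1 -maxdepth 1 | wc -l)

echo "with glob: $glob_count entries, depth-limited: $flat_count entries"
rm -rf "$demo_dir"
```

Note that the glob form also skips hidden files and fails when the directory is empty, which the depth-limited form does not.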
 
Old 01-17-2012, 09:40 PM   #19
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 86
Another possible issue is what is sometimes called the "command buffer" or "argument length" limit. That's why, although it may not have seemed "elegant", I illustrated reading the output of an ls command into variables and acting on a single file name per loop iteration, rather than building a single "long" command with a list of file names. The list of file names may grow too long, depending on your exact situation.

That's also why the -exec action of find, or find combined with the xargs command, can be so nice: the file names are handed over in batches (through a pipe, in the case of xargs) rather than packed into one command line, which may be subject to length limitations.
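
As a rough illustration of both variants (the file names are invented, and -print0 / -0 assume a find and xargs that support NUL separators, as the GNU versions do):

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/old one.bak" "$demo_dir/old two.bak" "$demo_dir/keep.txt"

# -exec ... {} + appends as many names as fit and runs rm again if needed.
find "$demo_dir" -maxdepth 1 -type f -name 'old one*' -exec rm -- {} +

# The same with xargs; NUL separators keep names with spaces in one piece.
find "$demo_dir" -maxdepth 1 -type f -name '*.bak' -print0 | xargs -0 rm --

remaining=$(ls "$demo_dir")
echo "remaining: $remaining"
rm -rf "$demo_dir"
```

In both cases rm is started once per batch rather than once per file, and no batch is allowed to exceed the system's argument length limit.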

Last edited by kakaka; 01-17-2012 at 09:43 PM.
 
Old 01-18-2012, 07:39 AM   #20
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
I would like to get it working with find, but I can't find a working way to implement it. My ls version works rather well right now. The only issue I have, which isn't a problem for me right now but could become one if the script were used on another directory, is that I can't process file names with spaces in them. I loop over the result I get from ls, and the loop apparently doesn't process it line by line but word by word instead. Is there some way to solve this? Can I write the loop some other way? Maybe pipe the ls result into something else?

Here is my script as it is right now with ls:

Code:
smart_rm_backups ()
{
	local TRGT_DIR=""
	local DAY_RM_THRESHOLD=2
	local DAY_SPAN=1
	local DIRECTORIES=1

	while getopts ":p:t:d" opt; do
		case $opt in
			p)
				TRGT_DIR=$OPTARG
			;;
			t)
				if [ $OPTARG -lt 7 ]; then
					DAY_RM_THRESHOLD=$OPTARG
				else
					DAY_RM_THRESHOLD=7
					DAY_SPAN=7
				fi
			;;
			d)
				DIRECTORIES=0
			;;
		esac
	done

	#Target must be a directory
	if [ ! -d "$TRGT_DIR" ]; then
		echo "$ISO_DATETIME [ERROR]: The target must exist and be a directory."
		return 1
	fi

	#Make sure that the path ends with /
	if [ "${TRGT_DIR#${TRGT_DIR%?}}" != "/" ]; then
		TRGT_DIR="${TRGT_DIR}/"
	fi

	#Files to delete
	local FILES_TO_DELETE
	local FOUND_ONE=1

	#Loop as long as there are files to examine
	for FILE in $(ls -1 -t $TRGT_DIR -I "*~")
	do
		#Get the file's modification date
		FILE="$TRGT_DIR$FILE"
		MTIME=$(date -d "$(stat -c %y $FILE)" +%s)

		#Check if we should skip directories
		if [ $DIRECTORIES -eq 1 ] && [ -d "$FILE" ]; then
			continue
		fi

		#Increase the day span accordingly
		if [ $DAY_RM_THRESHOLD -lt 7 ]; then
			DAY_SPAN=1
		elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
			DAY_SPAN=7
		elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt $((28+30*11)) ]; then
			DAY_SPAN=30
		else
			DAY_SPAN=365
		fi

		#If the file's modification time is earlier than our date range we push it back one $DAY_SPAN
		if [ $MTIME -lt $(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s) ]; then
			DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
			FOUND_ONE=1
		fi

		#Get date range
		DATE_END=$(date -d "$DAY_RM_THRESHOLD days ago" +%s)
		DATE_START=$(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s)

		#Have we found one to keep for this day?
		if [ $FOUND_ONE -eq 1 ] && [ $MTIME -ge $DATE_START ] && [ $MTIME -lt $DATE_END ]; then
			FOUND_ONE=0
		else
			FILES_TO_DELETE=(${FILES_TO_DELETE[@]} "$FILE")
		fi	
	done

	#Delete the selected files
	for FILE in ${FILES_TO_DELETE[@]}
	do
		rm -R $FILE
	done
}
 
Old 01-18-2012, 08:21 AM   #21
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

Rep: Reputation: 242
Maybe use read ?
Code:
ls -1 -t $TRGT_DIR -I "*~" | while read FILE; do
...
Also don't forget to quote $FILE everywhere
Code:
MTIME=$(date -d "$(stat -c %y "$FILE")" +%s)
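
To make the difference concrete, here is a minimal sketch with invented file names. The for loop splits the ls output on every space, while read takes one line at a time. Names containing newlines would still break the read loop, so this is only safe for ordinary names.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/backup one.tar" "$demo_dir/backup_two.tar"

# Unquoted command substitution is split on whitespace:
# "backup one.tar" arrives as two separate words.
word_count=0
for name in $(ls "$demo_dir"); do
	word_count=$((word_count + 1))
done

# read consumes one line per iteration; IFS= and -r keep leading
# spaces and backslashes in the name intact.
ls "$demo_dir" > "$demo_dir.list"
line_count=0
while IFS= read -r name; do
	[ -e "$demo_dir/$name" ] && line_count=$((line_count + 1))
done < "$demo_dir.list"

echo "for loop saw $word_count words, read loop saw $line_count files"
rm -rf "$demo_dir" "$demo_dir.list"
```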
 
Old 01-18-2012, 10:21 AM   #22
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
Quote:
Originally Posted by Cedrik View Post
Maybe use read ?
Code:
ls -1 -t $TRGT_DIR -I "*~" | while read FILE; do
...
Also don't forget to quote $FILE everywhere
Code:
MTIME=$(date -d "$(stat -c %y "$FILE")" +%s)
Thank you! That solved it, though I had some trouble finding out that read was running in a subshell and couldn't set variables in the parent shell, so I had to output everything to a temporary file.

This isn't a very elegant solution, so if someone knows how to get this done with find, please let me know. This is what I have now:

Code:
smart_rm_backups ()
{
	local TRGT_DIR=""
	local DAY_RM_THRESHOLD=2
	local DAY_SPAN=1
	local DIRECTORIES=1
	local FOUND_ONE=1

	while getopts ":p:t:d" opt; do
		case $opt in
			p)
				TRGT_DIR=$OPTARG
			;;
			t)
				if [ $OPTARG -lt 7 ]; then
					DAY_RM_THRESHOLD=$OPTARG
				else
					DAY_RM_THRESHOLD=7
					DAY_SPAN=7
				fi
			;;
			d)
				DIRECTORIES=0
			;;
		esac
	done

	#Target must be a directory
	if [ ! -d "$TRGT_DIR" ]; then
		echo "$ISO_DATETIME [ERROR]: The target must exist and be a directory."
		return 1
	fi

	#Make sure that the path ends with /
	if [ "${TRGT_DIR#${TRGT_DIR%?}}" != "/" ]; then
		TRGT_DIR="${TRGT_DIR}/"
	fi

	#Find files to remove and put them in "files_to_delete.tmp"
	ls -1 -t $TRGT_DIR -I "*~" | while read FILE
	do
		#Get the file's modification date
		FILE="$TRGT_DIR$FILE"
		MTIME=$(date -d "$(stat -c %y "$FILE")" +%s)

		#Check if we should skip directories
		if [ $DIRECTORIES -eq 1 ] && [ -d "$FILE" ]; then
			continue
		fi

		#Increase the day span accordingly
		if [ $DAY_RM_THRESHOLD -lt 7 ]; then
			DAY_SPAN=1
		elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
			DAY_SPAN=7
		elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt $((28+30*11)) ]; then
			DAY_SPAN=30
		else
			DAY_SPAN=365
		fi

		#If the file's modification time is earlier than our date range we push it back one $DAY_SPAN
		if [ $MTIME -lt $(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s) ]; then
			DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
			FOUND_ONE=1
		fi

		#Get date range
		DATE_END=$(date -d "$DAY_RM_THRESHOLD days ago" +%s)
		DATE_START=$(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s)

		#Have we found one to keep for this day?
		if [ $FOUND_ONE -eq 1 ] && [ $MTIME -ge $DATE_START ] && [ $MTIME -lt $DATE_END ]; then
			FOUND_ONE=0
		else
			echo "$FILE"
		fi	
	done > "files_to_delete.tmp"

	#Delete the files
	OLDIFS=$IFS
	IFS=$'\n'

	cat "files_to_delete.tmp" | while read FILE
	do
		rm -R $FILE
	done

	#Reset IFS
	IFS=$OLDIFS

	#Remove the temporary file containing what to remove
	rm "files_to_delete.tmp"
}
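
The subshell effect mentioned in this post can be reproduced in a few lines. This is only a sketch with made-up data: in bash (and most sh implementations) each part of a pipeline runs in a subshell, so a variable changed inside `... | while read` is lost afterwards, whereas a loop fed by redirection runs in the current shell.

```shell
demo_file=$(mktemp)
printf '%s\n' "first file" "second file" > "$demo_file"

# Pipeline: in bash the loop body runs in a subshell, so the count is lost.
piped_count=0
cat "$demo_file" | while IFS= read -r line; do
	piped_count=$((piped_count + 1))
done

# Redirection: the loop runs in the current shell, so the count survives.
redirected_count=0
while IFS= read -r line; do
	redirected_count=$((redirected_count + 1))
done < "$demo_file"

echo "piped: $piped_count, redirected: $redirected_count"
rm -f "$demo_file"
```

With the redirection form, variables such as DAY_RM_THRESHOLD or an array of names would still be set after the loop, at the cost of writing the listing to a file first.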
 
Old 01-18-2012, 04:16 PM   #23
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 86

Could you just put the "rm" in place of the "echo", and eliminate the other loop and the temporary file?

Code:
		.  .  .

		#Have we found one to keep for this day?
		if [ $FOUND_ONE -eq 1 ] && [ $MTIME -ge $DATE_START ] && [ $MTIME -lt $DATE_END ]; then
			FOUND_ONE=0
		else
			rm -R "$FILE"
		fi
	done
}
As has effectively already been mentioned, if you've got spaces within the file names, you should quote virtually every use of the FILE variable's value.

Another thought: there are different types of elegance. Someone could have a program with 100 lines of code that uses "brute force" approaches, which might not seem very elegant at the code level. Perhaps the program could be changed to use more "sophisticated" approaches, resulting in a program with only 10 lines of code. The 10-line program might seem to have more elegant code. But if it consumes 10 times as much of the computer's horsepower as the brute-force approach, then the "elegant" code is not elegant in its use of the computer's resources.

With a shell script, using something built into the shell typically takes less of the computer's horsepower than running an external program to do the same thing. Reading values into variables may seem rather brute-force and inelegant, but if it saves running programs external to the shell, it may be elegant as far as its use of the computer's horsepower is concerned.

Last edited by kakaka; 01-18-2012 at 04:55 PM.
 
Old 01-18-2012, 05:06 PM   #24
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
I guess I can. It just doesn't feel right to remove a file before knowing exactly what to delete. I can also pipe into xargs at the end of the loop. Would it be better to run rm once for every file or just once for all files?
 
Old 01-18-2012, 07:13 PM   #25
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 86
Quote:
Originally Posted by gunnarflax View Post
I guess I can. It just doesn't feel right to remove a file before knowing exactly what to delete. I can also pipe into xargs at the end of the loop. Would it be better to run rm once for every file or just once for all files?
Maybe I'm in too much of a hurry and am misinterpreting the shell script code you've shown us, but it appears that when you are ready to echo the file name, you do know which file you wish to delete. By echoing the file name, you are placing it into the temporary file, which contains the list of file names to be deleted, yes? Don't you already run "rm" in a loop, a separate loop after the main loop? If you don't know how many characters' worth of file names you might have, that's where the command buffer size or argument length limitation comes into play. If you try to remove all the files with a single command, the script might fail with an error because you exceed such limit(s).

If you could be sure that file names on your system had at most one space in sequence, not two, and that your script would not be running when midnight occurs, you could use code such as this to get the dates for all backup files at once:

Code:
#!/bin/bash

declare -a file_info

# Get all file modification dates as seconds since Unix/Linux Epoch,
# with a single command.
# Then eliminate output columns apart from date and file name.
# Handle file name as portion of array during read, to account
# for possible space in file name.
ls  -1lt -I "*~"  --time-style +%s  "$TRGT_DIR"  |  tail -n +2  |  cut -c36-  |  while read -r -a file_info
        do
                file_date=${file_info[0]}
                # Remove file date from array.
                unset 'file_info[0]'
                # Concatenate array elements to form file name, handles single spaces in file names, not two spaces in sequence.
                file_name="${file_info[*]}"
                echo "file_name: '$file_name', date as secs. since Epoch: $file_date"
        done
For each date threshold, store the corresponding number of seconds since the Epoch in some variables, so you avoid the use of the external date and stat commands inside the loop.
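
A minimal sketch of that idea, using the variable names from the script in lower case and a hypothetical helper called in_range. It treats a day as exactly 86400 seconds, which can differ by an hour from what `date -d "N days ago"` returns around a daylight-saving change, so take it as an approximation.

```shell
# One external call to date, before the loop starts.
now=$(date +%s)
seconds_per_day=86400

day_rm_threshold=2
day_span=1

# Range boundaries computed with shell arithmetic only.
date_end=$((now - day_rm_threshold * seconds_per_day))
date_start=$((now - (day_rm_threshold + day_span) * seconds_per_day))

# Hypothetical helper: succeeds if the given mtime (in Epoch seconds)
# lies inside the half-open range [date_start, date_end).
in_range() {
	[ "$1" -ge "$date_start" ] && [ "$1" -lt "$date_end" ]
}

echo "range: $date_start up to (not including) $date_end"
```

Inside the loop, the boundaries only need to be recomputed when the threshold changes, and no further date processes are started.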

Last edited by kakaka; 01-18-2012 at 07:16 PM.
 
Old 02-01-2012, 10:17 PM   #26
padeen
Member
 
Registered: Sep 2009
Location: Perth, W.A.
Distribution: Slackware 14, Debian 7, FreeBSD, OpenBSD
Posts: 179

Rep: Reputation: 35
Quote:
Originally Posted by kakaka View Post
command buffer size or argument length limitation
Just FYI, it's pretty unlikely nowadays, since kernel 2.6.23. Most implementations have the limit at around 2MB, which is a lot of file names! See man 2 execve and http://stackoverflow.com/questions/1...variable-value
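
For anyone who wants to check the limit on their own machine, getconf reports it. This is just a quick look; the value differs between systems, and the environment variables also count against it.

```shell
# Maximum combined size, in bytes, of the arguments and environment
# that can be passed to a newly executed program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"
```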
 
Old 02-02-2012, 12:05 AM   #27
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 86
Well, I'm using kernel 2.6.37, and I get the error often enough on average to remind me that it's there.

After all, if you process plenty of full path names, as output by commands such as locate or find, or by a recursive grep with plenty of result file names, the total number of characters can add up fairly quickly.
 
Old 02-02-2012, 04:41 AM   #28
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
I've finished the script and it works quite well. I'll post it here as soon as I can. Thanks for all the help!
 
Old 02-02-2012, 12:30 PM   #29
gunnarflax
LQ Newbie
 
Registered: Nov 2010
Posts: 20

Original Poster
Rep: Reputation: 4
I've attached the final code to this post. I've never put something under a license before but I thought it'd be nice to do it. I simply followed the instructions here: http://www.gnu.org/licenses/gpl-howto.html

I couldn't attach an archive including the license, but I think that's ok. Please let me know if I did something wrong.

Code:
#!/bin/bash

#----------------------------------------------#
#    Copyright (C) Niklas Rosenqvist, 2012
#----------------------------------------------#
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.

#----------------------------#
# Smart rm old files
#----------------------------#

# -t: = path to directory to clean
# -a: = amount of days to save all files (default = 2, max 7)
# -d = delete directories as well

show_error ()
{
	echo "${PROG_NAME}: ${1:-"Unknown error"}"
}

PROG_NAME=$(basename "$0")
TRGT_DIR=""
DAY_RM_THRESHOLD=2
DAY_SPAN=1
DAYS_TO_SAVE=$(date -d "2 days ago" +%s)
DIRECTORIES=1
FOUND_ONE=1

while getopts ":t:a:d" opt; do
	case $opt in
		t)
			TRGT_DIR=$OPTARG
		;;
		a)
			#Check if age is an integer (digits only)
			if ! [[ "$OPTARG" =~ ^[0-9]+$ ]]; then
				show_error "[ERROR]: Age (-a) must be an integer with the value 1-7"
				exit 1
			fi

			if [ $OPTARG -lt 7 ]; then
				DAY_RM_THRESHOLD=$OPTARG
				DAYS_TO_SAVE=$(date -d "$(($OPTARG)) days ago" +%s)
			else
				DAY_RM_THRESHOLD=7
				DAYS_TO_SAVE=$(date -d "7 days ago" +%s)
				DAY_SPAN=7
			fi
		;;
		d)
			DIRECTORIES=0
		;;
	esac
done

#Reset $OPTIND
OPTIND=1

#Target must be a directory
if [ ! -d "$TRGT_DIR" ]; then
	show_error "[ERROR]: The target must exist and be a directory." 
	exit 1
fi

echo "[INFO]: Starting the logarithmic backup cleaning."

#Find files to remove and delete them as we go
ls -1 -t "$TRGT_DIR/" -I "*~" | while IFS= read -r FILE
do
	#Get the file's modification date
	FILE="$TRGT_DIR/$FILE"
	MTIME=$(date -d "$(stat -c %y "$FILE")" +%s)

	#If it's within the range to save all files we skip this one
	if [ $MTIME -ge $DAYS_TO_SAVE ]; then
		continue
	fi

	#Check if we should skip directories
	if [ $DIRECTORIES -eq 1 ] && [ -d "$FILE" ]; then
		continue
	fi

	#Increase the day span accordingly
	if [ $DAY_RM_THRESHOLD -lt 7 ]; then
		DAY_SPAN=1
	elif [ $DAY_RM_THRESHOLD -ge 7 ] && [ $DAY_RM_THRESHOLD -lt 28 ]; then
		DAY_SPAN=7
	elif [ $DAY_RM_THRESHOLD -ge 28 ] && [ $DAY_RM_THRESHOLD -lt $((28+30*11)) ]; then
		DAY_SPAN=30
	else
		DAY_SPAN=365
	fi

	#If the file's modification time is earlier than our date range we push it back one $DAY_SPAN
	if [ $MTIME -lt $(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s) ]; then
		DAY_RM_THRESHOLD=$(($DAY_RM_THRESHOLD+$DAY_SPAN))
		FOUND_ONE=1
	fi

	#Get date range
	DATE_END=$(date -d "$DAY_RM_THRESHOLD days ago" +%s)
	DATE_START=$(date -d "$(($DAY_RM_THRESHOLD+$DAY_SPAN)) days ago" +%s)

	#Have we found one to keep for this day?
	if [ $FOUND_ONE -eq 1 ] && [ $MTIME -ge $DATE_START ] && [ $MTIME -lt $DATE_END ]; then
		FOUND_ONE=0
	else
		rm -R "$FILE"
	fi
done
#done | xargs -d '\n' rm -R

echo "[INFO]: Cleaning of old files complete!"

exit 0
Thanks for all the help!

Last edited by gunnarflax; 02-02-2012 at 12:32 PM.
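
The retention schedule buried in the if/elif chain above can be pulled out into a small function, which makes it easier to see which age ranges end up keeping one file each. This is a sketch: span_for_threshold is a name invented here, and the limits are copied from the script.

```shell
# Hypothetical helper: print the width (in days) of the range that
# starts at the given age threshold.
span_for_threshold() {
	threshold=$1
	if [ "$threshold" -lt 7 ]; then
		echo 1
	elif [ "$threshold" -lt 28 ]; then
		echo 7
	elif [ "$threshold" -lt $((28 + 30 * 11)) ]; then
		echo 30
	else
		echo 365
	fi
}

# Walk the schedule from the default threshold and print each range.
threshold=2
while [ "$threshold" -lt 400 ]; do
	span=$(span_for_threshold "$threshold")
	echo "keep one file aged $threshold to $((threshold + span)) days"
	threshold=$((threshold + span))
done
```

So the ranges are daily up to one week, weekly up to four weeks, then 30 days wide for eleven steps, and yearly after that.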
 
Old 02-02-2012, 05:48 PM   #30
Reuti
Senior Member
 
Registered: Dec 2004
Location: Marburg, Germany
Distribution: openSUSE 13.1
Posts: 1,320

Rep: Reputation: 252
It looks like it would be nice if find had an option, besides -depth, to sort the entries by time, either within each directory or overall.

NB: stat -c %Y "$FILE" (uppercase y) outputs the seconds directly.
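
A quick comparison of the two forms, assuming GNU stat, date and touch (the test file and its timestamp are made up):

```shell
demo_file=$(mktemp)
# Give the file a known modification time, expressed in Epoch seconds.
touch -d "@1000000000" "$demo_file"

# %Y prints the modification time as seconds since the Epoch directly.
direct=$(stat -c %Y "$demo_file")

# The form used in the script: human-readable time from %y, parsed by date.
roundabout=$(date -d "$(stat -c %y "$demo_file")" +%s)

echo "direct: $direct, via date: $roundabout"
rm -f "$demo_file"
```

Both give the same number, but the first needs one external process per file instead of two.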
 
  

