LinuxQuestions.org
Old 03-08-2008, 01:23 PM   #1
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Rep: Reputation: 66
Is my backup strategy overkill?


My computer is my laptop. I perform all backups using a script that I wrote, which uses rsync.

Also in my house are a desktop, an external disk, and two usb flash drives.

On all 4 of the devices I keep a one-to-one copy of the files I want backed up, in a folder called "today". I don't have any particular schedule; I just run it as I see fit.

On the desktop and external disk I also keep folders called yesterday, week, month, six_month, and year. When the year folder is over a year old, it updates itself from the six_month. When the six_month is over six months old, it updates from month. Etc.
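
The age test behind each rotation step can be sketched roughly as follows. This is only an illustration, not the script itself: needs_rotation is a hypothetical helper, it assumes GNU stat and date, and it reads the folder's modification time (%Y) so that the demo can fake old folders with touch, whereas the script reads %Z.

```shell
# Hypothetical helper: succeed if DIR was last modified more than
# MAX_AGE seconds ago (GNU stat; %Y is the mtime in epoch seconds).
needs_rotation() {
    dir=$1
    max_age=$2
    now=$(date +%s)
    mtime=$(stat --format=%Y "$dir")
    [ $(( now - mtime )) -gt "$max_age" ]
}

# Print whether a snapshot folder is due, given its age limit in seconds.
report() {
    if needs_rotation "$1" "$2"; then
        echo "$(basename "$1"): due for rotation"
    else
        echo "$(basename "$1"): up to date"
    fi
}

# Demo in a scratch directory: both folders are made to look 2 days old.
demo=$(mktemp -d)
mkdir "$demo/yesterday" "$demo/week"
touch -d '2 days ago' "$demo/yesterday" "$demo/week"

report "$demo/yesterday" 86400     # limit of one day: due
report "$demo/week" 604800         # limit of one week: not due yet
rm -rf "$demo"
```

Checking the oldest folder first and working down to yesterday means each snapshot is refreshed from the next younger one before that one is itself overwritten.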

Also, on all 4 devices there is a folder called TRASH. When running a backup to the today folder, if a file has been changed or deleted, the old version has the date appended to it (filename.ext__YYYYMMDDHHMMSS) and it is moved to the TRASH folder. When the rotations are taking place, the TRASH folder is NOT used.
Code:
/disk/user/today
/disk/user/yesterday/
/disk/user/week
/disk/user/month
/disk/user/six_month
/disk/user/year
/disk/user/TRASH

/desktop/user/today
/desktop/user/yesterday/
/desktop/user/week
/desktop/user/month
/desktop/user/six_month
/desktop/user/year
/desktop/user/TRASH

/usb1/user/today
/usb1/user/TRASH

/usb2/user/today
/usb2/user/TRASH
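
As a small illustration of the TRASH naming scheme (filename.ext__YYYYMMDDHHMMSS), here is a sketch. trash_name is a hypothetical helper, not part of the script, and the optional date argument assumes GNU date.

```shell
# Hypothetical helper: build the TRASH name for a replaced file.
# The optional second argument is a date string understood by GNU date;
# it defaults to the current time.
trash_name() {
    file=$1
    when=${2:-now}
    printf '%s__%s\n' "$file" "$(date -d "$when" +%Y%m%d%H%M%S)"
}

trash_name "report.odt" "2008-03-08 13:23:00"   # report.odt__20080308132300
trash_name "report.odt"                         # suffix from the current time
```

Because the suffix sorts chronologically, listing the TRASH folder shows the saved versions of a file in the order they were replaced.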
My thoughts were that the 4 separate devices will help mitigate any hardware failure. The folder rotations would help mitigate any file corruption, and also provide regular snapshots, which could be useful. And the TRASH folder would help with user error and provide somewhat of a versioning system.

However, it occurred to me that maybe having the TRASH folder AND the 5 snapshots isn't necessary. If there is a corrupted file on my laptop, will rsync view it as a change, thus storing the good version in the TRASH folder?

I think if I picked between the trash system and the rotations system, I would choose the trash system...

I'm aware that the major hole in my strategy right now is that all of these devices are located in the same building, but I'm going to get a remote setup running soon. Basically it would just be one more device with the same type of directory structure.

Here is the script. It's pretty straightforward: all of the setup is at the top, and the actual program is at the bottom. Sorry if it's a little messy, but that's because of all the output and colors.
Code:
#!/bin/bash

fail()
{
	echo -en "\033[0;37;41m"
	for dev in ${autoDevs[@]}; do
		if umount $dev; then
			echo "Dismounted $dev"
		else
			echo "Failed to dismount $dev!!!"
		fi
	done
	echo -e "$1\033[0m\n"
	exit 1
}

case "$1" in
	<user1>)
		backups=(
			".mozilla"				0 "--exclude=Cache/"
		  	".mozilla-thunderbird"	0 ""
		  	".purple"				0 ""
		  	"Documents"				1 ""
		  	"Pictures"				1 ""
		  	"Scripts"				1 ""
		  	"Database.kdb"			1 ""
		  	".gnupg"				1 ""
		  	"Music"					2 ""
		  	"Videos"				2 "" )
		  	
		case "$2" in
			usb1 | usb)
				src_dev=
				src_dev_usr=
				src_mnt=
				src_mnt_usr=
				src=
				dest_dev=
				dest_dev_usr=
				dest_mnt=/media/usb1
				dest_mnt_usr=
				dest=$dest_mnt/<user1>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( )
				todo=( 0 1 2 3 4 5 6 )
			;;
		
			usb2)
				src_dev=
				src_dev_usr=
				src_mnt=
				src_mnt_usr=
				src=
				dest_dev=
				dest_dev_usr=
				dest_mnt=/media/usb2
				dest_mnt_usr=
				dest=$dest_mnt/<user1>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( )
				todo=( 0 1 2 3 4 5 6 7 )
			;;
		
			desktop)
				src_dev=
				src_dev_cred=
				src_mnt=
				src_mnt_usr=
				src=
				dest_dev=//<desktop>/<user1_share>
				dest_dev_cred=/root/<cred_file>
				dest_mnt=/mnt/desktop/<user1>
				dest_mnt_usr=<local_user>
				dest=$dest_mnt
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( year 31536000 six_months
							six_months 15552000 month
							month 2592000 week
							week 604800 yesterday
							yesterday 86400 today )		    
				todo=( 0 1 2 3 4 5 6 7 8 9)
			;;
		
			disk)
				src_dev=
				src_dev_usr=
				src_mnt=
				src_mnt_usr=
				src=
				dest_dev=
				dest_dev_usr=
				dest_mnt=/media/disk
				dest_mnt_usr=
				dest=$dest_mnt/<user1>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( year 31536000 six_months
							six_months 15552000 month
							month 2592000 week
							week 604800 yesterday
							yesterday 86400 today )	    
				todo=( 0 1 2 3 4 5 6 7 8 9)
			;;
		
			*)
				fail "You need to enter a valid device!"
			;;
		esac
		
		if [ -z "$src" ]; then
			src=/home/<user1>
		fi
	;;
	
	<user2>)
		backups=(
			"Application Data/Mozilla"		0 "--exclude=Cache/"
			"Application Data/Thunderbird"	0 ""
			"My Documents"					1 "" )
			
		case $2 in
			disk)
				src_dev=//<desktop>/<user2_share>
				src_dev_cred=/root/<cred_file>
				src_mnt=/mnt/office/<user2>
				src_mnt_usr=<local_user>
				src=$src_mnt
				dest_dev=
				dest_dev_usr=
				dest_mnt=/media/disk
				dest_mnt_usr=
				dest=$dest_mnt/<user2>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( year 31536000 six_months
							six_months 15552000 month
							month 2592000 week
							week 604800 yesterday
							yesterday 86400 today )	
				todo=( 0 1 2 )
			;;
			
			laptop)
				src_dev=//<desktop>/<user2>
				src_dev_cred=/root/<cred_file>
				src_mnt=/mnt/office/<user2>
				src_mnt_usr=<local_user>
				src=$src_mnt
				dest_dev=
				dest_dev_cred=
				dest_mnt=
				dest_mnt_usr=
				dest=/home/<user2>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( )
				todo=( 0 1 2 )
			;;
			
			*)
				fail "You need to enter a valid device!"
			;;
		esac
	;;
	
	<user2>)
		backups=(
			"Application Data/Mozilla"		0 "--exclude=Cache/"
			"Application Data/Thunderbird"	0 ""
			"My Documents"					1 "" )
			
		case $2 in
			disk)
				src_dev=//<desktop>/<user2>
				src_dev_cred=/root/<cred_file>
				src_mnt=/mnt/office/<user2>
				src_mnt_usr=<local_user>
				src=$src_mnt
				dest_dev=
				dest_dev_usr=
				dest_mnt=/media/disk
				dest_mnt_usr=
				dest=$dest_mnt/<user2>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( year 31536000 six_months
							six_months 15552000 month
							month 2592000 week
							week 604800 yesterday
							yesterday 86400 today )	
				todo=( 0 1 2 )
			;;
			
			laptop)
				src_dev=//<desktop>/<user2>
				src_dev_cred=/root/<cred_file>
				src_mnt=/mnt/office/<user2>
				src_mnt_usr=<local_user>
				src=$src_mnt
				dest_dev=
				dest_dev_cred=
				dest_mnt=
				dest_mnt_usr=
				dest=/home/<user2>
				trash=$dest/TRASH
				trash_ret=259200
				rotations=( )
				todo=( 0 1 2 )
			;;
			
			*)
				fail "You need to enter a valid device!"
			;;
		esac
	;;
	
	*)
		fail "You need to enter a valid user"
	;;
esac

info="--progress --itemize-changes"
comp="--recursive --times --no-whole-file --modify-window=5 --exclude=Thumbs.db --exclude=Desktop.ini --exclude=desktop.ini --exclude=thumbs.db --exclude=*.lock"
rot="--delete-during --exclude=/Music/* --exclude=/Videos/* --exclude=/Downloads/* --exclude=/dat"
meth=(
	"$info $comp --delete-during"
	"$info $comp --delete-during --backup --backup-dir=$trash --suffix=__$(date +%Y%m%d%H%M%S)"
	"$info $comp"
	)

echo -e "\033[0;30;47mPerforming backup for user: $1 to location: $2\033[0m"

echo -e "\033[0;32m\nSETTING UP SOURCE\033[0m"
if [ -n "$src_mnt" ]; then
	if ! mount | grep $src_mnt; then
		if [ -n "$src_dev" ]; then
			if mount -t cifs -ouid=$src_mnt_usr,credentials=$src_dev_cred,file_mode=0400,dir_mode=0500 $src_dev $src_mnt; then
				echo -e "\033[0;34m\tAutomatically mounted $src_dev to $src_mnt\033[0m"
				autoDevs[0]=$src_mnt
			else
				fail "Failed to mount $src_dev to $src_mnt"
			fi
		else
			fail "No device given to automatically mount at $src_mnt"
		fi
	else
		echo -e "\033[0;34m\tIt seems $src_dev is already mounted to $src_mnt\033[0m"
	fi
else
	echo -e "\033[0;34m\tAssuming source $src is available.\033[0m"
fi

echo -e "\033[0;32m\nSETTING UP DESTINATION\033[0m"
if [ -n "$dest_mnt" ]; then
	if ! mount | grep $dest_mnt; then
		if [ -n "$dest_dev" ]; then
			if mount -t cifs -ouid=$dest_mnt_usr,credentials=$dest_dev_cred,file_mode=0600,dir_mode=0700 $dest_dev $dest_mnt; then
				echo -e "\033[0;34m\tAutomatically mounted $dest_dev to $dest_mnt\033[0m"
				autoDevs[1]=$dest_mnt
			else
				fail "Failed to mount $dest_dev to $dest_mnt"
			fi
		else
			fail "No device given to automatically mount at $dest_mnt"
		fi
	else
		echo -e "\033[0;34m\tIt seems $dest_dev is already mounted to $dest_mnt\033[0m"
	fi
else
	echo -e "\033[0;34m\tAssuming destination $dest is available.\033[0m"
fi

if [ -n "$trash" ]; then
	echo -e "\033[0;32m\nEMPTYING TRASH\033[0m"
	find $trash/* -amin +$trash_ret -cmin +$trash_ret -type f -delete -print
	find $trash/* -empty -type d -delete -print
fi

if (( ${#rotations[@]} > 0 )); then
	echo -e "\033[0;32m\nPERFORMING ROTATIONS\033[0m"
	for (( i = 0 ; i < ${#rotations[@]} ; i = i + 3 )); do
		age=$(( $(date +%s) - $(stat --format=%Z "$dest/${rotations[$i]}") ))
		if (( $age > ${rotations[$i+1]} )); then
			echo -e "\033[0;34m\tUpdating ${rotations[$i]} with ${rotations[$i+2]} because the age $age is greater than ${rotations[$i+1]}\033[0m"
			if ! rsync $info $comp $rot "$dest/${rotations[$i+2]}/" "$dest/${rotations[$i]}"; then
				fail "Something went wrong with the rotation: ${rotations[$i]}"
			fi
		else
			echo -e "\033[0;34m\tRotation ${rotations[$i]} with ${rotations[$i+2]} is up to date b/c age $age is less than ${rotations[$i+1]}\033[0m"
		fi
	done
	
	dest=$dest/today
fi

echo -e "\033[0;32m\nPERFORMING BACKUPS\033[0m"
for i in ${todo[@]}; do
	echo -e "\033[0;34m\tBacking up ${backups[$i*3]}\033[0m"
	if ! rsync ${meth[${backups[$i*3+1]}]} ${backups[$i*3+2]} "$src/${backups[$i*3]}" "$dest"; then
		fail "Something went wrong with backup number $i: ${backups[$i*3]}"
	fi
done
touch $dest

echo -e "\033[0;32m\nUNMOUNTING DEVICES\033[0m"
for dev in ${autoDevs[@]}; do
	if umount $dev; then
		echo -e "\033[0;34m\tDismounted $dev successfully.\033[0m"
	else
		echo -e "\033[0;34m\tFailed to dismount $dev!!!\033[0m"
	fi
done

echo -e "\033[0;32m\nSUCCESS, HAVE A NICE DAY.\033[0m\n"

exit 0

Last edited by Meson; 03-08-2008 at 01:35 PM.
 
Old 03-08-2008, 02:46 PM   #2
okmyx
Member
 
Registered: May 2004
Location: Cornwall, UK
Distribution: Ubuntu 8.04
Posts: 464

Rep: Reputation: 31
Depends on what you're backing up, really.

If you're a coder or developer then it would make sense.

If it's pictures of your cat and the odd letter to your gran, then yes, it's overkill =)
 
Old 03-08-2008, 02:48 PM   #3
hob
Senior Member
 
Registered: Mar 2004
Location: Wales, UK
Distribution: Debian, Ubuntu
Posts: 1,075

Rep: Reputation: 45
FWIW, I realized a while ago that I can divide my personal files into three categories: those where I care about previous versions (because they are my own work), those where I don't (multimedia, mostly), and settings files. All my email is on a reliable IMAP service (Fastmail).

For the first type of file I currently use Subversion with the repositories on an off-site server, and keep a second set of working copies there as well. I'll be migrating those to Git shortly, which has strong integrity checks and doesn't require you to maintain a central repository. For the second type I just have rsync scripts so that I can keep copies on both my laptop and an external hard drive, to prevent a single drive failure from causing any losses.

I think that it pays to be a little paranoid. The hard drive on my laptop started to act oddly, and failed totally two boots later. Since I knew that I hadn't actually lost any data it wasn't a big deal.
 
Old 03-08-2008, 03:01 PM   #4
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Original Poster
Rep: Reputation: 66
Ahhh, I forgot to say. I do have three categories. If you look at the backup definitions (the array backups), there are numbers associated with each folder: 0, 1 and 2. 0 is for my mozilla and pidgin profiles; I only want a 1-to-1 copy of each and I don't care about keeping deletions. 1 is for a 1-to-1 copy in the today folder, but deletions go to the TRASH folder. 2 is for large stuff like music and videos. Chances are I don't want to store EVERYTHING on my laptop, so stuff with the 2 backup isn't deleted on the backup location if it's removed from my laptop.

Also, the large stuff, like music and videos is only backed up to the "today" folder. It isn't rotated into the snapshot rotations.
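
Put another way, the three categories differ only in which deletion and backup options are handed to rsync. A rough sketch of that mapping follows. method_flags and TRASH_DIR are hypothetical names used for illustration; the rsync options themselves are the ones the script uses.

```shell
# Hypothetical helper: map a category number to rsync deletion/backup options.
# TRASH_DIR is an assumed variable naming the TRASH folder on the destination.
method_flags() {
    case "$1" in
        0) echo "--delete-during" ;;   # mirror; removed files are simply gone
        1) echo "--delete-during --backup --backup-dir=$TRASH_DIR --suffix=__$(date +%Y%m%d%H%M%S)" ;;   # mirror; old versions kept in TRASH
        2) echo "" ;;                  # copy only; never delete on the destination
        *) echo "unknown method: $1" >&2; return 1 ;;
    esac
}

TRASH_DIR=/disk/user/TRASH
for m in 0 1 2; do
    echo "method $m: $(method_flags "$m")"
done
```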

You see, the rot variable specifies rsync commands for rotations. Large stuff that can be replaced is ignored.
Code:
rot="--delete-during --exclude=/Music/* --exclude=/Videos/* --exclude=/Downloads/* --exclude=/dat"
And the meth variable specifies the backup method (0 1 or 2). 0 Means just delete changes. 1 means delete changes but save them. 2 means don't do deletions.
Code:
meth=(
	"$info $comp --delete-during"
	"$info $comp --delete-during --backup --backup-dir=$trash --suffix=__$(date +%Y%m%d%H%M%S)"
	"$info $comp"
	)
 
Old 03-08-2008, 04:14 PM   #5
bigrigdriver
LQ Addict
 
Registered: Jul 2002
Location: East Centra Illinois, USA
Distribution: Debian Squeeze
Posts: 5,745

Rep: Reputation: 301
Whether or not it's overkill depends on how important the files are to you. If they are very important, the farther back you can reach to find a file which isn't corrupt, the better for you.

Which brings up a question I have. How do you verify the files are not corrupt when you make the backup, or when you try to restore from a previous backup?
Do you just assume the files are not corrupt?
Do you just assume the files in the backups are not corrupt?
Do you do any kind of checksum comparison between a backup and the files on disk to verify they are good?

Files change over time. Some change because of edits, such as documents. Some change because of system updates. Some change because of file corruption. How do you protect yourself from unwanted changes (corruption)?

It wouldn't do much good to have multiple backups if file corruption is carried forward through each rotation.
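
One possible shape for such a checksum comparison, as a minimal sketch only: record a manifest of checksums when a snapshot is known to be good, and re-check it before trusting or restoring that snapshot. write_manifest and verify_manifest are hypothetical helpers. They assume GNU find, sort, xargs and sha256sum, and that the manifest path is absolute and stored outside the directory being checked.

```shell
# Hypothetical helper: write a checksum manifest for every file under DIR.
write_manifest() {
    dir=$1
    manifest=$2
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Hypothetical helper: re-check DIR against a manifest.
# A non-zero exit status means at least one file no longer matches.
verify_manifest() {
    dir=$1
    manifest=$2
    ( cd "$dir" && sha256sum --check --quiet "$manifest" )
}

# Demo in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/today"
echo "letter to gran" > "$demo/today/letter.txt"
write_manifest "$demo/today" "$demo/today.sha256"

verify_manifest "$demo/today" "$demo/today.sha256" && echo "backup verified"

# Simulate silent corruption: same size, different content.
echo "letter to grab" > "$demo/today/letter.txt"
verify_manifest "$demo/today" "$demo/today.sha256" >/dev/null 2>&1 || echo "corruption detected"
rm -rf "$demo"
```

A manifest like this only says that a file changed since it was recorded. Telling an intended edit from corruption still needs the older snapshots or the TRASH copies to compare against.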
 
Old 03-09-2008, 11:08 AM   #6
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Original Poster
Rep: Reputation: 66
Quote:
Originally Posted by hob View Post
I'll be migrating those to Git shortly, which has strong integrity checks and doesn't require you to maintain a central repository.
Have you looked at Bazaar at all?
 
Old 03-09-2008, 02:46 PM   #7
hob
Senior Member
 
Registered: Mar 2004
Location: Wales, UK
Distribution: Debian, Ubuntu
Posts: 1,075

Rep: Reputation: 45
Quote:
Originally Posted by Meson View Post
Have you looked at Bazaar at all?
Only a little - one of the reasons that I've put off migrating off Subversion is that I don't understand the fundamental difference between Bazaar and Mercurial, and hoped that one would emerge as an obvious winner. I had discounted Git because I thought what a kernel developer considered usable might not work for me.

Three things got me off the fence:

- The Ruby community seems to have picked Git, and are moving in numbers.
- Watching the video of Linus' talk at Google about Git. He is very articulate and absolutely passionate about the issues, particularly data integrity.
- I tried it and was amazed: the interface of the current version is fine, the docs are good (!), and it is *very* fast. Like apt-get, it can be so quick that you don't quite believe that it did what you asked.
 
Old 03-09-2008, 05:14 PM   #8
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Original Poster
Rep: Reputation: 66
Anyone that can talk for 70 minutes about their software on stage deserves to have it tried out.
 
Old 03-10-2008, 01:02 PM   #9
dracolich
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 1,173

Rep: Reputation: 47
Any strategy for backing up data is not overkill. Most people I talk to don't back up at all. They don't even save Word documents until they've finished them.
Quote:
I'm aware that the major hole in my strategy right now is that all of these devices are located in the same building, but I'm going to get a remote setup running soon.
The question should not be whether your plan is overkill, but whether or not the backed-up data is retrievable after a disaster. Am I right thinking that these backups are all stored on the same disk as the originals? That's the major hole that I see. Are you copying/moving the backups to an external medium at all? If the purpose of the backup is to have copies of important files in the event of a disaster, what if the disaster involves the entire contents of the disk being wiped or the disk itself physically failing?
 
Old 03-10-2008, 01:37 PM   #10
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Original Poster
Rep: Reputation: 66
Quote:
Originally Posted by dracolich View Post
Am I right thinking that these backups are all stored on the same disk as the originals?
No, I think you must have just missed that part of my explanation. There are actually 5 devices, including my main computer: my laptop, my parents' desktop, an external RAID-1 array of two 500GB disks, and two usb flash drives, one of which I keep on my keychain and the other in a safe.

My laptop and the two usb drives have only the most up-to-date copy of my data, so that's 3. The desktop and external raid device each have a copy of the most up-to-date data as well as 5 old snapshots.

That's 15 copies of my data over 5 devices, 21 over 6 if you count each disk in the raid enclosure as a separate device.

Then, in addition to that, each device has a trash folder. When backups are made to the most up-to-date folder, deletions are placed in the trash with a timestamp appended to the file name. My real question was whether people thought that this trash system was redundant when combined with the rotations system.

My backups are done by comparing file sizes and times. Error checking is run on my laptop once every 27 boots (Ubuntu default). I'm not sure about the desktop, and it is really never run on either usb key. But I'm assuming that corruption will not affect file sizes or times, so corruption on my laptop will not be pushed to the backups. Therefore the fact that I'm not backing up with a checksum comparison is actually providing a layer of protection. Is this a fair assumption?
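
The assumption can at least be tried out on a scratch file. By default rsync decides whether to transfer a file from its size and modification time (the script also passes --modify-window=5, which treats times within 5 seconds as equal), and only --checksum makes it compare contents. The sketch below does not call rsync: same_size_and_mtime is a hypothetical stand-in for that size-and-time test, built on GNU stat, and the "corruption" is simulated by rewriting a byte and restoring the timestamp.

```shell
# Hypothetical stand-in for a size-and-time comparison: succeed when both
# files have the same size and modification time (GNU stat).
same_size_and_mtime() {
    [ "$(stat --format='%s %Y' "$1")" = "$(stat --format='%s %Y' "$2")" ]
}

demo=$(mktemp -d)
printf 'important data\n' > "$demo/laptop.txt"
cp --preserve=timestamps "$demo/laptop.txt" "$demo/backup.txt"

# Simulate bit rot on the laptop: the content changes, size and mtime do not.
printf 'important dat@\n' > "$demo/laptop.txt"
touch -r "$demo/backup.txt" "$demo/laptop.txt"

if same_size_and_mtime "$demo/laptop.txt" "$demo/backup.txt"; then
    echo "size/time check: files look identical, nothing would be copied"
fi
if ! cmp -s "$demo/laptop.txt" "$demo/backup.txt"; then
    echo "byte comparison: contents differ"
fi
rm -rf "$demo"
```

Whether real corruption leaves the timestamp untouched depends on its cause, and the same blind spot means corruption on a backup device would go unnoticed too.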
 
Old 03-10-2008, 07:27 PM   #11
dracolich
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 1,173

Rep: Reputation: 47
Quote:
No, I think you must have just missed that part of my explanation. There are actually 5 devices, including my main computer: my laptop, my parents' desktop, an external RAID-1 array of two 500GB disks, and two usb flash drives, one of which I keep on my keychain and the other in a safe.

My laptop and the two usb drives have only the most up-to-date copy of my data, so that's 3. The desktop and external raid device each have a copy of the most up-to-date data as well as 5 old snapshots.

That's 15 copies of my data over 5 devices, 21 over 6 if you count each disk in the raid enclosure as a separate device.
At the time I was thinking each device was storing backups of its own files only. My mistake.

I still don't think it's overkill. I was always taught that you can't have too many copies of a backup. That becomes even more true if the data is valuable and/or irreplaceable. As you mention in your original post, keeping the older versions of files in TRASH while the others get updated gives you previous versions to fall back on if necessary. So, no, I don't think it's redundant. Although at some point you may want to put an age limit on the files in TRASH. But that's entirely up to you.
 
Old 03-10-2008, 07:37 PM   #12
Meson
Member
 
Registered: Oct 2007
Distribution: Arch x86_64
Posts: 606

Original Poster
Rep: Reputation: 66
Quote:
Originally Posted by dracolich View Post
Although at some point you may want to put an age limit on the files in TRASH. But that's entirely up to you.
I do, it's roughly 6 months right now. Each entry has a variable called trash_ret, which is defined in minutes (because that is the timescale that "find" uses).

Code:
find $trash/* -amin +$trash_ret -cmin +$trash_ret -type f -delete -print
find $trash/* -empty -type d -delete -print
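
For reference, the arithmetic behind that value and a non-destructive way to preview what would be removed can be sketched like this. days_to_minutes and list_expired are hypothetical helpers. The preview assumes GNU find and tests only the modification time (-mmin) to keep the demo simple, whereas the commands above test -amin and -cmin.

```shell
# Hypothetical helper: convert a retention period in days to minutes,
# the unit used by find's -amin, -cmin and -mmin tests.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

# Hypothetical helper: list (without deleting) the files under a TRASH
# folder that were last modified more than MINUTES minutes ago.
list_expired() {
    find "$1" -type f -mmin +"$2" -print
}

trash_ret=$(days_to_minutes 180)
echo "$trash_ret"    # 259200

# Demo in a scratch directory: one entry 200 days old, one fresh.
demo=$(mktemp -d)
touch -d '200 days ago' "$demo/old.txt__20070821120000"
touch "$demo/new.txt__20080308132300"
list_expired "$demo" "$trash_ret"    # prints only the old entry
rm -rf "$demo"
```

Running the listing first, and adding -delete only once the output looks right, is a cheap safeguard when changing the retention value.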
 
  


All times are GMT -5.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration