My computer is my laptop. I perform all backups using a script that I wrote around rsync.
Also in my house are a desktop, an external disk, and two usb flash drives.
On all 4 of the devices I keep a one-to-one copy of the files which I want backed up in a folder called "Today." I don't have any particular schedule; I just run it as I see fit.
On the desktop and external disk I also keep folders called yesterday, week, month, six_month, and year. When the year folder is over a year old, it updates itself from the six_month. When the six_month is over six months old, it updates from month. Etc.
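As a rough sketch of the rotation rule just described: a folder is refreshed from the next-newer one once its age exceeds a limit. The names `rotation_due` and `list_due_rotations` are made up for illustration, and the sketch assumes GNU `stat` and `date`. It looks at the folder's modification time (`%Y`), whereas the script further down checks the change time (`%Z`).

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when a snapshot folder is older
# than max_age seconds and should be refreshed from the next-newer one.
rotation_due() {
    local dir=$1 max_age=$2
    local now=${3:-$(date +%s)}                    # optional "now", handy for testing
    local mtime
    mtime=$(stat --format=%Y "$dir") || return 2   # GNU stat: mtime in epoch seconds
    (( now - mtime > max_age ))
}

# Same name / max-age / source triples as the script's rotations array.
rotations=( year       31536000 six_months
            six_months 15552000 month
            month      2592000  week
            week       604800   yesterday
            yesterday  86400    today )

# Print which rotations would run under a given base directory.
list_due_rotations() {
    local base=$1 i
    for (( i = 0; i < ${#rotations[@]}; i += 3 )); do
        if rotation_due "$base/${rotations[i]}" "${rotations[i+1]}"; then
            echo "${rotations[i]} <- ${rotations[i+2]}"
        fi
    done
}
```

Calling `list_due_rotations /media/disk/someuser` (a placeholder path) would only list what is due; it does not copy anything.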
Also, on all 4 devices there is a folder called TRASH. When running a backup to the today folder, if a file is changed or modified, the old version has the date appended to it (filename.ext__YYYYMMDDHHMMSS) and it is moved to the TRASH folder. When the rotations are taking place, the TRASH folder is NOT used.
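To get a file back out of TRASH, the `__YYYYMMDDHHMMSS` suffix has to be stripped again. A small sketch of how that could be done with plain bash parameter expansion follows; the function names `trash_original_name` and `trash_timestamp` are hypothetical and not part of the script.

```shell
#!/bin/bash
# Hypothetical helpers for names of the form filename.ext__YYYYMMDDHHMMSS.

# Strip the trailing __timestamp to recover the original file name.
# ${name%__*} removes the shortest suffix starting with "__", so a name
# that itself contains "__" keeps everything before the last "__".
trash_original_name() {
    local name=$1
    echo "${name%__*}"
}

# Print the 14-digit timestamp, or fail if the name has no valid suffix.
trash_timestamp() {
    local name=$1
    local stamp=${name##*__}
    [[ $name == *__* && $stamp =~ ^[0-9]{14}$ ]] || return 1
    echo "$stamp"
}
```

For example, `trash_original_name 'notes.txt__20240131093000'` prints `notes.txt`.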
My thoughts were that the 4 separate devices will help mitigate any hardware failure. The folder rotations would help mitigate any file corruption, and also provide regular snapshots, which could be useful. And the TRASH folder would help with user error and provide somewhat of a versioning system.
However, it occurred to me that maybe having the TRASH folder AND the 5 snapshots isn't necessary. If there is a corrupted file on my laptop, will rsync view it as a change, thus storing the good version in the TRASH folder?
I think if I picked between the trash system and the rotations system, I would choose the trash system...
I'm aware that the major hole in my strategy right now is that all of these devices are located in the same building, but I'm going to get a remote setup running soon. Basically it would just be one more device with the same type of directory structure.
Here is the script. It's pretty straightforward: all of the setup is at the top, the actual program is at the bottom. Sorry if it's a little messy, but that's because of all the output and colors.
Code:
#!/bin/bash
fail()
{
echo -en "\033[0;37;41m"
for dev in ${autoDevs[@]}; do
if umount $dev; then
echo "Dismounted $dev"
else
echo "Failed to dismount $dev!!!"
fi
done
echo -e "$1\033[0m\n"
exit 1
}
case "$1" in
<user1>)
backups=(
".mozilla" 0 "--exclude=Cache/"
".mozilla-thunderbird" 0 ""
".purple" 0 ""
"Documents" 1 ""
"Pictures" 1 ""
"Scripts" 1 ""
"Database.kdb" 1 ""
".gnupg" 1 ""
"Music" 2 ""
"Videos" 2 "" )
case "$2" in
usb1 | usb)
src_dev=
src_dev_usr=
src_mnt=
src_mnt_usr=
src=
dest_dev=
dest_dev_usr=
dest_mnt=/media/usb1
dest_mnt_usr=
dest=$dest_mnt/<user1>
trash=$dest/TRASH
trash_ret=259200
rotations=( )
todo=( 0 1 2 3 4 5 6 )
;;
usb2)
src_dev=
src_dev_usr=
src_mnt=
src_mnt_usr=
src=
dest_dev=
dest_dev_usr=
dest_mnt=/media/usb2
dest_mnt_usr=
dest=$dest_mnt/<user1>
trash=$dest/TRASH
trash_ret=259200
rotations=( )
todo=( 0 1 2 3 4 5 6 7 )
;;
desktop)
src_dev=
src_dev_cred=
src_mnt=
src_mnt_usr=
src=
dest_dev=//<desktop>/<user1_share>
dest_dev_cred=/root/<cred_file>
dest_mnt=/mnt/desktop/<user1>
dest_mnt_usr=<local_user>
dest=$dest_mnt
trash=$dest/TRASH
trash_ret=259200
rotations=( year 31536000 six_months
six_months 15552000 month
month 2592000 week
week 604800 yesterday
yesterday 86400 today )
todo=( 0 1 2 3 4 5 6 7 8 9)
;;
disk)
src_dev=
src_dev_usr=
src_mnt=
src_mnt_usr=
src=
dest_dev=
dest_dev_usr=
dest_mnt=/media/disk
dest_mnt_usr=
dest=$dest_mnt/<user1>
trash=$dest/TRASH
trash_ret=259200
rotations=( year 31536000 six_months
six_months 15552000 month
month 2592000 week
week 604800 yesterday
yesterday 86400 today )
todo=( 0 1 2 3 4 5 6 7 8 9)
;;
*)
fail "You need to enter a valid device!"
;;
esac
if [ -z "$src" ]; then
src=/home/<user1>
fi
;;
<user2>)
backups=(
"Application Data/Mozilla" 0 "--exclude=Cache/"
"Application Data/Thunderbird" 0 ""
"My Documents" 1 "" )
case $2 in
disk)
src_dev=//<desktop>/<user2_share>
src_dev_cred=/root/<cred_file>
src_mnt=/mnt/office/<user2>
src_mnt_usr=<local_user>
src=$src_mnt
dest_dev=
dest_dev_usr=
dest_mnt=/media/disk
dest_mnt_usr=
dest=$dest_mnt/<user2>
trash=$dest/TRASH
trash_ret=259200
rotations=( year 31536000 six_months
six_months 15552000 month
month 2592000 week
week 604800 yesterday
yesterday 86400 today )
todo=( 0 1 2 )
;;
laptop)
src_dev=//<desktop>/<user2>
src_dev_cred=/root/<cred_file>
src_mnt=/mnt/office/<user2>
src_mnt_usr=<local_user>
src=$src_mnt
dest_dev=
dest_dev_cred=
dest_mnt=
dest_mnt_usr=
dest=/home/<user2>
trash=$dest/TRASH
trash_ret=259200
rotations=( )
todo=( 0 1 2 )
;;
*)
fail "You need to enter a valid device!"
;;
esac
;;
<user2>)
backups=(
"Application Data/Mozilla" 0 "--exclude=Cache/"
"Application Data/Thunderbird" 0 ""
"My Documents" 1 "" )
case $2 in
disk)
src_dev=//<desktop>/<user2>
src_dev_cred=/root/<cred_file>
src_mnt=/mnt/office/<user2>
src_mnt_usr=<local_user>
src=$src_mnt
dest_dev=
dest_dev_usr=
dest_mnt=/media/disk
dest_mnt_usr=
dest=$dest_mnt/<user2>
trash=$dest/TRASH
trash_ret=259200
rotations=( year 31536000 six_months
six_months 15552000 month
month 2592000 week
week 604800 yesterday
yesterday 86400 today )
todo=( 0 1 2 )
;;
laptop)
src_dev=//<desktop>/<user2>
src_dev_cred=/root/<cred_file>
src_mnt=/mnt/office/<user2>
src_mnt_usr=<local_user>
src=$src_mnt
dest_dev=
dest_dev_cred=
dest_mnt=
dest_mnt_usr=
dest=/home/<user2>
trash=$dest/TRASH
trash_ret=259200
rotations=( )
todo=( 0 1 2 )
;;
*)
fail "You need to enter a valid device!"
;;
esac
;;
*)
fail "You need to enter a valid user"
;;
esac
info="--progress --itemize-changes"
comp="--recursive --times --no-whole-file --modify-window=5 --exclude=Thumbs.db --exclude=Desktop.ini --exclude=desktop.ini --exclude=thumbs.db --exclude=*.lock"
rot="--delete-during --exclude=/Music/* --exclude=/Videos/* --exclude=/Downloads/* --exclude=/dat"
meth=(
"$info $comp --delete-during"
"$info $comp --delete-during --backup --backup-dir=$trash --suffix=__$(date +%Y%m%d%H%M%S)"
"$info $comp"
)
echo -e "\033[0;30;47mPerforming backup for user: $1 to location: $2\033[0m"
echo -e "\033[0;32m\nSETTING UP SOURCE\033[0m"
if [ -n "$src_mnt" ]; then
if ! mount | grep $src_mnt; then
if [ -n "$src_dev" ]; then
if mount -t cifs -ouid=$src_mnt_usr,credentials=$src_dev_cred,file_mode=0400,dir_mode=0500 $src_dev $src_mnt; then
echo -e "\033[0;34m\tAutomatically mounted $src_dev to $src_mnt\033[0m"
autoDevs[0]=$src_mnt
else
fail "Failed to mount $src_dev to $src_mnt"
fi
else
fail "No device given to automatically mount at $src_mnt"
fi
else
echo -e "\033[0;34m\tIt seems $src_dev is already mounted to $src_mnt\033[0m"
fi
else
echo -e "\033[0;34m\tAssuming source $src is available.\033[0m"
fi
echo -e "\033[0;32m\nSETTING UP DESTINATION\033[0m"
if [ -n "$dest_mnt" ]; then
if ! mount | grep $dest_mnt; then
if [ -n "$dest_dev" ]; then
if mount -t cifs -ouid=$dest_mnt_usr,credentials=$dest_dev_cred,file_mode=0600,dir_mode=0700 $dest_dev $dest_mnt; then
echo -e "\033[0;34m\tAutomatically mounted $dest_dev to $dest_mnt\033[0m"
autoDevs[1]=$dest_mnt
else
fail "Failed to mount $dest_dev to $dest_mnt"
fi
else
fail "No device given to automatically mount at $dest_mnt"
fi
else
echo -e "\033[0;34m\tIt seems $dest_dev is already mounted to $dest_mnt\033[0m"
fi
else
echo -e "\033[0;34m\tAssuming destination $dest is available.\033[0m"
fi
if [ -n "$trash" ]; then
echo -e "\033[0;32m\nEMPTYING TRASH\033[0m"
find $trash/* -amin +$trash_ret -cmin +$trash_ret -type f -delete -print
find $trash/* -empty -type d -delete -print
fi
if (( ${#rotations[@]} > 0 )); then
echo -e "\033[0;32m\nPERFORMING ROTATIONS\033[0m"
for (( i = 0 ; i < ${#rotations[@]} ; i = i + 3 )); do
age=$(( $(date +%s) - $(stat --format=%Z "$dest/${rotations[$i]}") ))
if (( $age > ${rotations[$i+1]} )); then
echo -e "\033[0;34m\tUpdating ${rotations[$i]} with ${rotations[$i+2]} because the age $age is greater than ${rotations[$i+1]}\033[0m"
if ! rsync $info $comp $rot "$dest/${rotations[$i+2]}/" "$dest/${rotations[$i]}"; then
fail "Something went wrong with the rotation: ${rotations[$i]}"
fi
else
echo -e "\033[0;34m\tRotation ${rotations[$i]} with ${rotations[$i+2]} is up to date b/c age $age is less than ${rotations[$i+1]}\033[0m"
fi
done
dest=$dest/today
fi
echo -e "\033[0;32m\nPERFORMING BACKUPS\033[0m"
for i in ${todo[@]}; do
echo -e "\033[0;34m\tBacking up ${backups[$i*3]}\033[0m"
if ! rsync ${meth[${backups[$i*3+1]}]} ${backups[$i*3+2]} "$src/${backups[$i*3]}" "$dest"; then
fail "Something went wrong with backup number $i: ${backups[$i*3]}"
fi
done
touch $dest
echo -e "\033[0;32m\nUNMOUNTING DEVICES\033[0m"
for dev in ${autoDevs[@]}; do
if umount $dev; then
echo -e "\033[0;34m\tDismounted $dev successfully.\033[0m"
else
echo -e "\033[0;34m\tFailed to dismount $dev!!!\033[0m"
fi
done
echo -e "\033[0;32m\nSUCCESS, HAVE A NICE DAY.\033[0m\n"
exit 0
FWIW, I realized a while ago that I can divide my personal files into three categories: those where I care about previous versions (because they are my own work), those where I don't (multimedia, mostly), and settings files. All my email is on a reliable IMAP service (Fastmail).
For the first type of file I currently use Subversion with the repositories on an off-site server, and keep a second set of working copies there as well. I'll be migrating those to Git shortly, which has strong integrity checks and doesn't require you to maintain a central repository. For the second type I just have rsync scripts so that I can keep copies on both my laptop and an external hard drive, to prevent a single drive failure from causing any losses.
I think that it pays to be a little paranoid. The hard drive on my laptop started to act oddly, and failed totally two boots later. Since I knew that I hadn't actually lost any data it wasn't a big deal.
Ahhh, I forgot to say: I do have three categories. If you look at the backup definitions (the array backups), there are numbers associated with each folder: 0, 1 and 2. 0 is for my mozilla and pidgin profiles; I only want a 1 to 1 of each and I don't care about deletions. 1 is for a 1 to 1 in the today folder, but deletions go to the TRASH folder. 2 is for large stuff like music and videos. Chances are I don't want to store EVERYTHING on my laptop, so stuff with the 2 backup isn't deleted on the backup location if it's removed from my laptop.
Also, the large stuff, like music and videos is only backed up to the "today" folder. It isn't rotated into the snapshot rotations.
You see, the rot variable specifies rsync commands for rotations. Large stuff that can be replaced is ignored.
And the meth variable specifies the backup method (0, 1 or 2). 0 means apply deletions and overwrites without keeping the old versions. 1 means apply them but save the old versions in TRASH. 2 means don't do deletions.
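The three methods could also be written as a small lookup function. This is only a sketch: `method_opts` is a made-up name, and it covers just the deletion/backup options from the meth array, not the `$info` and `$comp` parts. The rsync flags themselves are the ones already used in the script.

```shell
#!/bin/bash
# Hypothetical helper: print the rsync deletion/backup options for each
# of the three methods, one option per line.
method_opts() {
    local method=$1 trash_dir=$2
    case "$method" in
        0) printf '%s\n' --delete-during ;;
        1) printf '%s\n' --delete-during --backup \
               "--backup-dir=$trash_dir" \
               "--suffix=__$(date +%Y%m%d%H%M%S)" ;;
        2) ;;   # no deletion options: files removed from the source stay in the backup
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Example: collect the options into an array (needs bash 4+ for mapfile),
# so that a trash path containing spaces stays one argument.
mapfile -t opts < <(method_opts 1 "/media/disk/TRASH")
printf 'rsync %s SRC DEST\n' "${opts[*]}"
```

Keeping the options in an array instead of one long string avoids relying on word splitting, which matters once a path contains spaces.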
Whether or not it's overkill depends on how important the files are to you. If they are very important, the farther back you can reach to find a file which isn't corrupt, the better for you.
Which brings up a question I have. How do you verify the files are not corrupt when you make the backup, or when you try to restore from a previous backup?
Do you just assume the files are not corrupt?
Do you just assume the files in the backups are not corrupt?
Do you do any kind of checksum comparison between a backup and the files on disk to verify they are good?
Files change over time. Some change because of edits, such as documents. Some change because of system updates. Some change because of file corruption. How do you protect yourself from unwanted changes (corruption)?
It wouldn't do much good to have multiple backups if file corruption is carried forward through each rotation.
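One possible way to do such a checksum comparison is sketched below, assuming GNU coreutils (`sha256sum`, `realpath`) and findutils. The function names `make_manifest` and `verify_manifest` are hypothetical. Note the limits: a manifest only tells you that a copy differs from what was recorded, so it has to be written while the files are known to be good, and it should be stored outside the directory it describes.

```shell
#!/bin/bash
# Hypothetical helpers; assume GNU coreutils and findutils.

# Write a checksum manifest for every regular file under a directory.
# Paths are recorded relative to the directory so the manifest can be
# checked against a copy stored elsewhere.
make_manifest() {
    local dir=$1 manifest=$2
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Re-check a directory against a manifest. A non-zero exit status means
# at least one file is missing or its checksum does not match.
verify_manifest() {
    local dir=$1 manifest
    manifest=$(realpath "$2") || return 2
    ( cd "$dir" && sha256sum --quiet -c "$manifest" )
}
```

A typical use would be `make_manifest` on the source right after a backup, then `verify_manifest` against the today folder or an older snapshot before restoring from it.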
Only a little - one of the reasons that I've put off migrating off Subversion is that I don't understand the fundamental difference between Bazaar and Mercurial, and hoped that one would emerge as an obvious winner. I had discounted Git because I thought what a kernel developer considered usable might not work for me.
Three things got me off the fence:
- The Ruby community seems to have picked Git, and are moving in numbers.
- Watching the video of Linus' talk at Google about Git. He is very articulate and absolutely passionate about the issues, particularly data integrity.
- I tried it and was amazed: the interface of the current version is fine, the docs are good (!), and it is *very* fast. Like apt-get, it can be so quick that you don't quite believe that it did what you asked.
Any strategy for backing up data is not overkill. Most people I talk to don't back up at all. They don't even save Word documents until they've finished them.
Quote:
I'm aware that the major hole in my strategy right now is that all of these devices are located in the same building, but I'm going to get a remote setup running soon.
The question should not be whether your plan is overkill, but whether or not the backed-up data is retrievable after a disaster. Am I right thinking that these backups are all stored on the same disk as the originals? That's the major hole that I see. Are you copying/moving the backups to an external medium at all? If the purpose of the backup is to have copies of important files in the event of a disaster, what if the disaster involves the entire contents of the disk being wiped or the disk itself physically failing?
Quote:
Am I right thinking that these backups are all stored on the same disk as the originals?
No, I think you must have just missed that part of my explanations. There are actually 5 devices, including my main computer. My laptop, my parents' desktop, an external two-500GB raid-1 array, and two usb flash drives, one which I keep on my keychain and the other which I keep in a safe.
My laptop and two usb drives have only the most up-to-date copy of my data, so that is 3. The desktop and external raid device each have a copy of the most up-to-date data as well as 5 old snapshots.
That's 15 copies of my data over 4 devices, 21 over 5 if you count each disk in the raid enclosure as a separate device.
Then in addition to that each device has a trash folder, when backups are made to the most uptodate folder, deletions are placed in the trash with a timestamp appended to the file name. My real question was whether people thought that this trash system was redundant when combined with the rotations system.
My backups are done by comparing file sizes and times. Error checking is run on my laptop once every 27 boots (Ubuntu default). I'm not sure about the desktop, and it's really never run on either usb key. But I'm assuming that corruption will not affect file sizes or times, so corruption on my laptop will not be pushed to the backups. Therefore the fact that I'm not backing up with a checksum comparison is actually providing a layer of protection. Is this a fair assumption?
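To make that assumption concrete, here is an approximation of the size-and-time comparison, written as a sketch and not taken from rsync itself. By default rsync skips files whose size and modification time match (unless `--checksum` is given), and `--modify-window` widens the time tolerance. The function name `quick_check_same` is made up, and GNU `stat` is assumed. Silent corruption that leaves size and mtime alone would look "unchanged" under this test; corruption that arrives through a normal write updates the mtime and would be copied like any other edit.

```shell
#!/bin/bash
# Hypothetical approximation of a size-and-mtime "quick check":
# two files count as unchanged when their sizes match and their
# modification times differ by no more than `window` seconds.
# Assumes GNU stat.
quick_check_same() {
    local a=$1 b=$2 window=${3:-0}
    local size_a size_b mtime_a mtime_b diff
    size_a=$(stat --format=%s "$a")  || return 2
    size_b=$(stat --format=%s "$b")  || return 2
    mtime_a=$(stat --format=%Y "$a") || return 2
    mtime_b=$(stat --format=%Y "$b") || return 2
    diff=$(( mtime_a - mtime_b ))
    if (( diff < 0 )); then
        diff=$(( -diff ))
    fi
    (( size_a == size_b && diff <= window ))
}
```

Two files with different content but identical size and timestamp pass this check, which is exactly the case the question is about.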
Quote:
No, I think you must have just missed that part of my explanations. There are actually 5 devices, including my main computer. My laptop, my parents' desktop, an external two-500GB raid-1 array, and two usb flash drives, one which I keep on my keychain and the other which I keep in a safe.
My laptop and two usb drives have only the most up-to-date copy of my data, so that is 3. The desktop and external raid device each have a copy of the most up-to-date data as well as 5 old snapshots.
That's 15 copies of my data over 4 devices, 21 over 5 if you count each disk in the raid enclosure as a separate device.
At the time I was thinking each device was storing backups of its own files only. My mistake.
I still don't think it's overkill. I was always taught that you can't have too many copies of a backup. That becomes even more true if the data is valuable and/or irreplaceable. As you mention in your original post, keeping the older versions of files in TRASH while the others get updated gives you previous versions to fall back on if necessary. So, no, I don't think it's redundant. Although at some point you may want to put an age limit on the files in TRASH. But that's entirely up to you.
Quote:
Although at some point you may want to put an age limit on the files in TRASH. But that's entirely up to you.
I do, it's roughly 6 months right now. Each entry has a variable called trash_ret, which is defined in minutes (because that is the timescale that "find" uses).
Code:
find $trash/* -amin +$trash_ret -cmin +$trash_ret -type f -delete -print
find $trash/* -empty -type d -delete -print
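For reference, the retention value works out as 180 days x 24 hours x 60 minutes = 259200 minutes. The sketch below shows that arithmetic plus a variant of the purge, assuming GNU find. `days_to_minutes` and `purge_trash` are hypothetical names, and the variant deliberately swaps in `-mmin` (modification time) for the `-amin`/`-cmin` pair used above, and `-mindepth 1` for the `$trash/*` glob so that an empty TRASH folder does not produce an error.

```shell
#!/bin/bash
# Hypothetical helpers; assume GNU find.

# Convert a retention period in days to the minutes that find's
# -amin/-cmin/-mmin tests expect.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

# Delete regular files under trash_dir that were last modified more than
# ret_minutes ago, then remove any directories left empty.
# Uses -mmin instead of the script's -amin/-cmin pair (a substitution
# made for this sketch, not the original behaviour).
purge_trash() {
    local trash_dir=$1 ret_minutes=$2
    find "$trash_dir" -mindepth 1 -type f -mmin "+$ret_minutes" -delete -print
    find "$trash_dir" -mindepth 1 -type d -empty -delete -print
}
```

With these, `purge_trash "$trash" "$(days_to_minutes 180)"` would apply the same six-month limit.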