Backup solution? Tar extraction is taking forever...
Linux - Desktop: This forum is for the discussion of all Linux software used in a desktop context.
I am trying to settle on a backup solution, and I would rather stick with the command line. I like the idea of tar because it has been around forever, it's simple, and it's a standard. I have Acronis and Ghost, but I want to get away from them and go command line (for scripting) and free (in both senses), of course. I am testing tar by extracting a single file from a 340 GB tarred-and-gzipped archive, and it's taking forever. I guess that's to be expected. I know Acronis and Ghost let you browse archives and pull out files fairly quickly; I would like the same, plus the ability to back up while the system is up and running. I know about Clonezilla and many of the live-CD options, but I want the system to stay up. Nothing too complicated, either: I only need it for a simple desktop backup. Imaging would be a nice plus, but I'm not pushing it. Sorry this is so long. Any ideas?
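For what it's worth, the slowness is structural: gzip produces a single stream with no index, so tar has to decompress and scan from the beginning of the archive to find any one member. A minimal sketch (throwaway temp files, so the paths are just examples) of the usual workaround: keep the tar uncompressed and use GNU tar's --occurrence so the scan stops at the first match instead of reading to the end:

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p data
echo "important config" > data/settings.conf
dd if=/dev/zero of=data/big.bin bs=1k count=512 2>/dev/null

# Uncompressed archive: no gzip stream to decompress on every lookup.
tar -cf backup.tar data

# Restore a single member; --occurrence lets GNU tar stop scanning
# at the first match instead of reading to the end of the archive.
mkdir restore
tar -xf backup.tar -C restore --occurrence data/settings.conf
cat restore/data/settings.conf
```

On a 340 GB archive the difference between "scan until first match" and "decompress and scan everything" is substantial, though an uncompressed tar still has no real index.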
I've got a bash script that runs every day while I'm asleep and backs up everything except the virtual directories. It uses rsync, and while my data is nowhere near 340 GB, it all goes to a separate drive with no waiting to get to a file. I keep 3 copies each of $HOME, everything else, and a list of the installed packages. The backup folders are suffixed with the day of the year, and the script does the math to decide which ones to remove. I had issues with find, but this works:
#!/bin/bash
#
# Backup routine to keep 3 sets
# of backup files.
#
# File Naming:
# os.[day of the year]
# barrie.[day of the year]
# packages.[day of the year]
#
#
# Variables
#
MYEMAIL=barrie@localhost
# Force base 10: date +%j zero-pads (e.g. 008), which bash
# arithmetic would otherwise reject as an invalid octal number.
doy=$((10#$(date +%j)))
declare -i nukem
declare -i middle
if [ $doy -le 3 ]; then
middle=3-$doy
nukem=366-$middle
else
nukem=$doy-3
fi
rsync -av --exclude-from=/usr/local/bin/eList / /media/backup/os.$doy
#
# Remove Thunar sessions
#
rm /home/barrie/.cache/sessions/Thunar*
#
# Empty the trash
#
rm -rf /home/barrie/.local/share/Trash/files
rm -rf /home/barrie/.local/share/Trash/info
mkdir /home/barrie/.local/share/Trash/files
mkdir /home/barrie/.local/share/Trash/info
chown -R barrie:barrie /home/barrie/.local/share/Trash
#
# Nuke the Nautilus sessions
# Leave 1 session file
#
# Specify the target directory and file names to operate on.
target_files=/home/barrie/.nautilus/saved-session-*
# Calculate the total number of files matching the target criteria.
total_files=$(ls -t1 $target_files 2>/dev/null | wc --lines)
# Specify the number of files to retain.
retained_files=1
# If there are surplus files, delete them.
if [ $total_files -gt $retained_files ]
then
rm $(ls -t1 $target_files | tail --lines=$((total_files-retained_files)))
fi
#
# Nuke the Metacity sessions
# Leave 1 session
#
# Specify the target directory and file names to operate on.
target_files=/home/barrie/.metacity/sessions/*
# Calculate the total number of files matching the target criteria.
total_files=$(ls -t1 $target_files 2>/dev/null | wc --lines)
# Specify the number of files to retain.
retained_files=1
# If there are surplus files, delete them.
if [ $total_files -gt $retained_files ]
then
rm $(ls -t1 $target_files | tail --lines=$((total_files-retained_files)))
fi
#
# Remove all the 'normal' thumbnails
#
rm /home/barrie/.thumbnails/normal/*
#
# Do the Backup
#
# Get the installed packages.
dpkg --get-selections | grep -v deinstall > /media/backup/packages.$doy
# Backup /home/barrie
rsync -av --exclude=/home/barrie/Music /home/barrie/ /media/backup/barrie.$doy
#
# Keep only 3 of each file set.
#
if [ -d /media/backup/os.$nukem ]
then
rm -rf /media/backup/os.$nukem
fi
if [ -d /media/backup/barrie.$nukem ]
then
rm -rf /media/backup/barrie.$nukem
fi
if [ -e /media/backup/packages.$nukem ]
then
rm /media/backup/packages.$nukem
fi
#
# Send me an email
#
#echo ""|\mutt -s "Backup Completed $doy" $MYEMAIL
exit 0
Last edited by barriehie; 12-07-2010 at 12:08 AM.
Reason: Oops, forgot the script.
I'm not surprised that a 340G tar takes a while. If you are just doing your one computer and want to keep it simple, command line, and free, then rsync might be an easy solution. There are quite a few backup programs or scripts with a few more bells and whistles that are based on rsync.
As far as tar goes, Amanda is often configured to use GNU tar, but typically you would break what you are backing up into smaller pieces. That makes each piece easier to recover and the backups more efficient. With Amanda's scheduler you would typically end up running many of the individual backups in parallel, but that is usually across multiple drive spindles and machines, whereas you are only doing one. Still, breaking it up might make things easier.
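A sketch of the "break it into pieces" idea outside Amanda (directory names invented for the example): one archive per top-level directory, so any single restore only has to scan a small tarball:

```shell
set -e
base=$(mktemp -d)

# Example layout: two top-level directories under a common source root.
mkdir -p "$base/src/docs" "$base/src/photos" "$base/out"
echo "report" > "$base/src/docs/report.txt"
echo "jpeg bytes" > "$base/src/photos/pic.jpg"

# One archive per directory: a restore from docs never touches photos.
for dir in "$base"/src/*/; do
    name=$(basename "$dir")
    tar -cf "$base/out/$name.tar" -C "$base/src" "$name"
done

ls "$base/out"
```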
Thanks for all of the responses. I took out all of my video files, eliminated gzip, and tried to extract a file without the leading "/", and it was fast. I'm not sure which change sped it up. Would the slash cause that? I know tar mentions removing the slash when creating the archive. Or was it the 152 GB of MP4 files? Or maybe the compression? The MP4 files are rips of DVDs I own anyway, so I don't really need to back them up; I have the hard copies, plus I keep them on another external HD that I plug into the Xbox/PS3 to watch movies. Anyway, thanks for the interesting responses. I do use Clonezilla on occasion for bare-metal images, and I'll look more into the script as well. I will also be using rsync for backing up to the other external HD I just mentioned. Oh, and I had never heard of xz. bz2 seemed to take too long, and since I'm not hurting for space I have no reason to use compression at all; I'd rather have the time back.
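A quick demonstration of the leading-slash effect (temp files, GNU tar assumed): members are stored without a leading "/", so asking to extract an absolute name matches nothing and tar reads the whole archive before giving up, while the stored relative name extracts straight away:

```shell
set -e
top=$(mktemp -d)
cd "$top"
mkdir -p etc
echo "cfg" > etc/demo.conf

tar -cf a.tar etc/demo.conf
tar -tf a.tar          # the member is stored as "etc/demo.conf"

mkdir restore
# Wrong: an absolute name does not match any stored member, so tar
# scans the entire archive and then reports "Not found in archive".
tar -xf a.tar -C restore /etc/demo.conf 2>/dev/null \
    || echo "no match for /etc/demo.conf"
# Right: the stored (relative) name extracts immediately.
tar -xf a.tar -C restore etc/demo.conf
```

On a small archive both cases return instantly; on 340 GB, the failed scan of the whole archive is exactly the "taking forever" symptom.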
I would guess the videos are already compressed enough without the extra gzip pass. When you compress many of them together, gzip is still searching for redundancy to exploit, which is a lot of work given the combined size of the files, and probably with little success.
Back up the videos uncompressed instead and see whether they take noticeably more space.
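That's easy to check. In the sketch below, random bytes stand in for already-compressed MP4 data and a repetitive text file stands in for compressible data; gzip barely moves the former while the latter collapses:

```shell
set -e
d=$(mktemp -d)

# Stand-in for already-compressed video: high-entropy random bytes.
dd if=/dev/urandom of="$d/video.bin" bs=1k count=256 2>/dev/null
# Stand-in for compressible data: very repetitive text.
yes "the same line of text" | head -n 10000 > "$d/text.txt"

# gzip -c keeps the originals so the sizes can be compared afterwards.
gzip -c "$d/video.bin" > "$d/video.bin.gz"
gzip -c "$d/text.txt"  > "$d/text.txt.gz"

ls -l "$d"
# video.bin.gz ends up about the same size as video.bin (or slightly
# larger), while text.txt.gz is a small fraction of text.txt.
```

So for 152 GB of MP4s, gzip burns CPU time for essentially no space savings, which matches what you saw.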