LinuxQuestions.org


vaf 02-20-2009 10:22 PM

Automated backup and archive on removable SATA drives, partitioning and burner share
 
Hi all

I am a pretty basic user, used KDE and Gnome, but I am more familiar with Mac and PC - forgive me, I'm trying my best to switch!

I have built a machine I want to use as a file server. It has two internal 1 TB drives in hardware RAID 1 and four removable SATA slots. I have installed Ubuntu Server 8.10 and set up the /Server directory as a share using Samba, which works perfectly fine. The RAID 1 is essential for data reliability and security.

I have installed two 1 TB SATA drives in the removable slots and mounted them as /Backup and /Archive. I would like to have the system do the following:
  • Every hour COPY (backup) any file in the /Server directory that has changed since the last backup to the /Backup directory under a folder structure that is built upon the backup time
  • Every month any file in the /Server directory that hasn't been ACCESSED within the past 6 months be MOVED to the /Archive directory

The Backup drive is to be removed every night and taken offsite as an additional security measure, so it needs to be hot-swappable. I'm pretty sure this is okay, but when the drive isn't in, the automated backup needs to keep a 'list' of what to back up when the drive returns (i.e. if /Backup is not mounted, accumulate a list of changed files and copy those files on the next backup along with any other changes).

The Archive drive should accumulate files, and when a critical percentage of its capacity is reached (say 90%) an email or some other notification should be sent. When the drive is full, the archive process/application should be suspended and a notification sent. Then a new drive will be put in its place (mounted at /Archive). What partitioning app should I use for this?

The /Backup and /Archive directories are read-only shares on SAMBA.

I haven't installed any GUI, as I was hoping to keep processor usage to a minimum, but I'm happy to if needed (which one, and which version? Is -core enough?).

I know that it is a very specific ask, but I'm sure it is a common way to set up a file server. The data needs to be 100% reliable (hence the setup), whilst security is nowhere near as important (so no encryption needed), as it is mainly for photos. I would really appreciate any help on this, as I want to get it right! Please help me with:
  • applications (preferably non-GUI based) to get me what I want
  • partitioning application
  • suggestions/alternatives/comments/criticisms on this setup

Oh and I have installed a DVD-DL burner in the server - is it possible to share the drive so that connected WinXP machines can use the drive as a burner - I can only figure out how to mount the drive as a shared folder, rather than as a device!

Thanks :D

jschiwal 02-20-2009 11:54 PM

To use the device as a burner, I think you will want to install cygwin/X on the XP machine. Then ssh into the server and pull the files you want burned from a share on the XP machine.

If you have the basic X install, you can run a program like k3b on the server but display its window on the XP machine. The X server doesn't even need to be running on the server: the application is the X client, and Cygwin/X on the XP machine is the X server.

---

For synchronizing from Server to Backup, you can use rsync. Another option is to use find to locate changed and new files. Yet another option is to use tar with the -g <timestamp>.snar option to back up only files that are more recent than the last backup.
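
As a rough illustration of the find option, here is a minimal sketch. It assumes GNU cp (for --parents); the function name copy_changed and the marker-file idea are made up for this example, and the target directory has to be given as an absolute path because the function changes directory before copying.

```shell
#!/bin/sh
# Sketch only: copy files modified since the previous run, using find.
# Assumes GNU cp (for --parents); copy_changed and the marker file are
# illustrative names, not something prescribed in this thread.
copy_changed() {
    src=$1      # directory to back up, e.g. /Server
    dest=$2     # absolute path of the target directory for this run
    stamp=$3    # marker file whose mtime records when the last run started

    mkdir -p "$dest" || return 1
    new_stamp=$(mktemp) || return 1   # created before copying starts

    if [ -e "$stamp" ]; then
        # only regular files modified after the previous run began
        (cd "$src" && find . -type f -newer "$stamp" \
            -exec cp --parents -p {} "$dest" \;)
    else
        # first run: no marker yet, so copy everything
        (cd "$src" && find . -type f -exec cp --parents -p {} "$dest" \;)
    fi || { rm -f "$new_stamp"; return 1; }

    # the new marker replaces the old one only after the copy has run
    mv "$new_stamp" "$stamp"
}
```

It would be called along the lines of copy_changed /Server "/Backup/$(date +%Y-%m-%d_%H%M)" /var/lib/backup/last-run, where the marker path is again just an assumption. Files that change while a run is in progress are picked up by the next run, because the marker is created before copying begins.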

For the tar approach, look in the tar info manual; section 5.2 on incremental backups.

If you want the files copied instead of a tarball, you can do that too, by piping the output of tar to another tar command:

mkdir -p /Backup/"$DATE"
tar -C /Server -g "$DATE".snar -cf - . | tar -C /Backup/"$DATE" -xvf - >logfile

A similar command is given in the tar info manual as well.
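
One detail worth spelling out: tar decides what is new or modified by comparing against the snapshot file given to -g, so the same snapshot file has to be kept between runs for later runs to skip unchanged files. A minimal sketch of that, assuming GNU tar; the function name incremental_copy and the snapshot path in the comments are only illustrative.

```shell
#!/bin/sh
# Sketch only: copy files changed since the previous run into a new
# timestamped directory. Assumes GNU tar; incremental_copy and the example
# paths are illustrative, not from the tar manual.
incremental_copy() {
    src=$1        # e.g. /Server
    dest_root=$2  # e.g. /Backup
    snar=$3       # absolute path of a snapshot file kept between runs,
                  # e.g. /var/lib/backup/server.snar

    stamp=$(date +%Y-%m-%d_%H%M%S)
    dest="$dest_root/$stamp"
    mkdir -p "$dest" || return 1

    # -g records file state in $snar; on the next run only new or modified
    # files are archived. The archive itself exists only in the pipe.
    tar -C "$src" -g "$snar" -cf - . | tar -C "$dest" -xf -
}
```

The first run with a fresh snapshot file copies everything; later runs produce directories holding only the files that are new or modified since the previous run (the directory tree itself is recreated each time).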

Suppose that you wanted to copy new files to /Backup and create a tar archive in /Archive:
tar -C /Server -g "$DATE".snar -cf - . | tee /Archive/"$DATE".tar | tar -C /Backup/"$DATE" -xvf - >logfile

If you use public key authentication with ssh and load your passphrase using ssh-agent, you can even use this to securely transfer a partition full of files to a computer across the country.

tar -C /Server -cf - . | ssh username@hostname tar -C /path/to/base/dir -xvf - >logfile

vaf 02-21-2009 01:16 AM

Thanks jschiwal, I've got some reading to do!

Quote:

Originally Posted by jschiwal (Post 3451909)
For synchronizing from Server to Backup, you can use rsync. Another option is to use find to locate changed and new files. Yet another option is to use tar with the -g <timestamp>.snar option to back up only files that are more recent than the last backup.

I'll read up on rsync and find, but I like the sound of tar.

Does the -g option take into account modified files or just new files?

Quote:

Originally Posted by jschiwal (Post 3451909)
Look in the tar info manual; section 5.2 on incremental backups.

If you want the files copied instead of a tarball, you can do that too, by piping the output of tar to another tar command:

mkdir -p /Backup/"$DATE"
tar -C /Server -g "$DATE".snar -cf - . | tar -C /Backup/"$DATE" -xvf - >logfile

Do I then need to delete the tarball once the extraction has happened?


Should I also be using cron to schedule the running of these as hourly and monthly events?

Also, how do I set up the notification for when a drive reaches a certain capacity?


Thanks

jschiwal 02-21-2009 10:40 AM

Except for the example with the tee command, the tar archive exists only in the stream. The tar command on the left-hand side sends the archive to stdout. The tar command on the right-hand side receives the stream from stdin and extracts it.

The tee command in the second example copies stdin to stdout and to the file. This allows you to simultaneously send files to the target and save a regular archive somewhere else.
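
A small self-contained sketch of the tee pattern, with the paths passed in as arguments so it can be tried on scratch directories first. The function name copy_and_archive is made up for this example.

```shell
#!/bin/sh
# Sketch only: extract a copy of a directory tree and keep a tar archive of
# it at the same time. copy_and_archive is an illustrative name.
copy_and_archive() {
    src=$1      # directory to read, e.g. /Server
    dest=$2     # directory to extract into
    archive=$3  # tar file to write, e.g. /Archive/2009-02-21.tar

    mkdir -p "$dest" || return 1

    # left tar writes the archive to stdout; tee saves a copy to $archive and
    # passes the same bytes on; right tar extracts them into $dest
    tar -C "$src" -cf - . | tee "$archive" | tar -C "$dest" -xf -
}
```

Afterwards, tar -tf on the archive file should list the same files that were extracted into the target directory.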

Before creating a script to be run by cron, run it manually. You will need to tweak which directories should be backed up, which ones shouldn't be, and which ones don't need to be. For example, don't back up /proc, /sys, /mnt or /media. Backing up /tmp would be a waste.

The tar piping examples work better between two machines if you use public key ssh authentication. Unfortunately, if you are copying files to a clone system, you may need to allow root logins or backup operator logins, so that you have permission to write to the target directories.

Look at the ssh-agent command to store the passphrase, if you don't mind having to enter it again manually after each restart of the computer before the private key can be used.

If you write a cron script, make sure you use the full path to commands and eliminate print operations or redirect console output. Cron jobs run with a minimal PATH and without an attached console, so a script that works from your shell can fail or lose its output when cron runs it.
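
To make that concrete, here is a hedged sketch of a small logging wrapper plus example crontab lines. The script names, log path and schedule in the comments are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: a cron-friendly way to run a command with its output logged.
# The script names, paths and schedule below are assumptions.
#
# Example crontab entries (edit with `crontab -e`):
#   PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#   0 * * * *  /usr/local/sbin/server-backup.sh    # every hour, on the hour
#   0 3 1 * *  /usr/local/sbin/server-archive.sh   # 03:00 on the 1st of each month

# Run a command with stdout and stderr appended to a log file, so nothing
# is written to a console that cron does not have.
run_logged() {
    log=$1
    shift
    {
        echo "== $(date): $*"
        "$@"
    } >>"$log" 2>&1
}
```

Inside a backup script this would be used as, for example, run_logged /var/log/server-backup.log /bin/tar ... with the full path to each command. The function returns the exit status of the command it ran, so the script can still react to failures.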

vaf 02-21-2009 04:36 PM

Thanks jschiwal.

I am only going to Backup and Archive the /Server directory. And it is always on the same machine - no network backup at the moment.

I have also been investigating rsync.

My main concerns at the moment are:
  • When the /Backup directory is not present (ie the drive has been removed for the night) the backup script shouldn't run. How do I do this? (some sort of IF statement??)
  • How do I set up the capacity notification for the drives?
  • What partitioning program should I use?

Thanks again, and I'll keep reading the manuals! But any help would be greatly appreciated!

jschiwal 02-21-2009 05:16 PM

Some distros have a partitioning program that lets you resize a partition. You may need to run the installer or rescue disk up to the partitioning phase (and no further) if it is a partition with a system directory mounted on it. GParted is a good partitioning program that is similar to PartitionMagic. However, if you are going to resize the RAID array, I'm not certain it has the smarts to do it.

If this is a new installation, maybe reinstalling would be the easiest route now that you have a better idea how you want it partitioned.

For the question on detecting whether the removable SATA drive is connected, you might check for its serial number in sysfs or in the /dev/disk/by-id/ hierarchy.
Code:

udevinfo -q all -n /dev/sdc
the program '/bin/bash' called 'udevinfo', it should use 'udevadm info <options>', this will stop working in a future release
P: /devices/pci0000:00/0000:00:1c.5/0000:20:00.0/fw-host0/0090a950ce3305e6/0090a950ce3305e6-0/host8/target8:0:0/8:0:0:0/block/sdc
N: sdc
S: disk/by-id/ieee1394-0090a950ce3305e6:000434:0000
S: disk/by-path/pci-0000:20:00.0-ieee1394-0x0090a950ce3305e6:000434:0000
E: ID_PATH=pci-0000:20:00.0-ieee1394-0x0090a950ce3305e6:000434:0000

example test:
[ -b /dev/disk/by-id/ieee1394-0090a950ce3305e6\:000434\:0000 ] && echo attached || echo not attached
attached

So if I wanted to detect the drive, I could use the [ -b ... ] test above. The main point is to use a unique identifier to detect the disk.
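
Put together as an IF statement, a guard could look roughly like the sketch below. It combines the [ -b ... ] test with a check that something is actually mounted on the directory, using the mountpoint utility, which I am assuming is installed. The function names are made up, and the by-id path is whatever udevadm reports for your own disk.

```shell
#!/bin/sh
# Sketch only: run a command when the removable drive is attached and
# mounted, otherwise skip quietly. is_mounted and backup_if_present are
# illustrative names; mountpoint is assumed to be available.
is_mounted() {
    mountpoint -q "$1"
}

backup_if_present() {
    dev=$1    # unique device node, e.g. a /dev/disk/by-id/... path
    mnt=$2    # where the drive should be mounted, e.g. /Backup
    shift 2   # the remaining arguments are the command to run

    if [ -b "$dev" ] && is_mounted "$mnt"; then
        "$@"
    else
        echo "backup drive not present, skipping" >&2
        return 0
    fi
}
```

A backup script would then start with something like backup_if_present /dev/disk/by-id/<your-disk-id> /Backup followed by the actual backup command, where <your-disk-id> stands for the identifier of your own drive.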

vaf 02-22-2009 03:36 AM

Sorry, maybe I didn't make my points clear enough.

I need to know how to format a new drive and how to mount it. During the install, the partitioning program did this, but I am thinking ahead to when I need to replace the archive drive. I am not going to re-partition any of the drives.


When the /Backup directory is not present (ie the drive has been removed for the night) the backup script shouldn't run. How do I do this? (some sort of IF statement??)


And finally, how do I set up the capacity notification for the drives?

Thanks again


All times are GMT -5.