LinuxQuestions.org


sixy 05-09-2006 09:24 PM

Backup vs RAID 1
 
I have a Fedora Core 5 system that is running on a 250gb disk, partitioned as follows:
100mb ext2 /boot
<lvm>
4gb swap
20gb ext3 /
~200gb xfs /home
</lvm>

There will be a large amount of data stored in /home, which has the potential to reach the 200gb limit, and it is this data I am worried about. The actual system files can easily be replaced.

I have a second, identical 250gb disk, and I want to make sure I do not lose that data if I can help it.

There are two options I have considered:

1: raid1 array
For this I would probably reinstall the OS (as there is as yet no data to be kept safe) and mirror the entire disk.
As budget constraints do not allow for a true hardware raid card, and as far as I know 2.6 kernels do not play nicely with on-board software raid controllers, this would be an mdadm-managed Linux software raid setup.

2: regular scripted backups
The method I would probably use (unless you have a better one) would be to freeze the /home filesystem, take a snapshot and unfreeze, then untar the previous backup, update it with rsync, and re-tar.

As far as I can see, the trade-offs of the two options are:
1 produces the least downtime, as the entire system is protected against hardware failure. However, no protection is offered against software/user problems (there is a good chance the end-users will delete a file they wish they hadn't. I can just say tough, but it would be nice not to).
2 offers some protection against software/user faults and also guards against hardware failure (although with more downtime involved). In the event of a system failure, as I would most likely run this script weekly due to high hardware demands, there is a higher chance that very recent data will not be recovered. Also, I am unsure whether my described method would be possible on a 200gb folder given that the disk is only 250gb. Does taking an lvm snapshot mean that the 200gb is stored twice, or is it stored once and linked to twice? Also, would the untar/rsync/tar be effective and work?

So there are the two options I have found. Please rip them to shreds as much as possible, for this is a fairly critical operation.
Also, if there are other options that I haven't thought of, please let me know.

many thanks for your help

haertig 05-09-2006 10:05 PM

RAID is not a backup solution, as you've noted. You are probably more likely to run into a user/software problem than a hardware problem. So I would go for the backup option over the RAID option, if I had to choose one over the other.

But if this data is "fairly critical" as you say, then you need both. If you can't afford downtime, and you seemed concerned about this, you need both. There's no magic bullet. If your data is critical, you need to spend some money to protect it.

Quote:

the method i would probably use (unless you have a better) would be to freeze the /home filesystem, take a snapshot and unfreeze.
I'm not sure you understand how a snapshot works. You take a snapshot, run your backup WHILE the snapshot is active, then delete the snapshot. A snapshot is not a separate copy. A snapshot gives you static data to back up from, and while that is going on, LVM copies the original contents of any block that is about to be changed into the snapshot's own area (copy-on-write), so the snapshot keeps showing the old data while new writes go to the main volume. When the snapshot is deleted, that saved-off old data is simply discarded. The snapshot you create just needs to be big enough to hold all the changes made while the backup is going on. For a low activity filesystem you could easily get by creating a snapshot that is only 5% or 10% of the filesystem size.
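
As a rough illustration of the snapshot-then-backup sequence described above, here is a minimal Python sketch that only builds the command lines and does not run them. The volume group name vg0, the mount point /mnt/homesnap, the destination /backup/home and both function names are made up for the example, and it assumes the volume group still has unallocated space to hold the snapshot.

```python
def snapshot_size_gb(fs_size_gb, percent=10):
    """Snapshot size as a whole-GB percentage of the filesystem, rounded up."""
    return -(-fs_size_gb * percent // 100)


def snapshot_backup_commands(vg, lv, snap_size_gb, mount_point, dest):
    """Return the commands for one snapshot-based backup run, in order.

    All names passed in are hypothetical; nothing is executed here.
    """
    snap = lv + "_snap"
    origin_dev = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create a copy-on-write snapshot of the origin volume.
        ["lvcreate", "--snapshot", "--size", f"{snap_size_gb}G",
         "--name", snap, origin_dev],
        # An XFS snapshot shares the origin's UUID, so mount it with nouuid.
        ["mount", "-o", "ro,nouuid", snap_dev, mount_point],
        # Back up from the frozen view while the live filesystem stays in use.
        ["rsync", "-a", mount_point + "/", dest + "/"],
        ["umount", mount_point],
        # Drop the snapshot once the backup has finished.
        ["lvremove", "-f", snap_dev],
    ]


if __name__ == "__main__":
    size = snapshot_size_gb(200, percent=10)
    for cmd in snapshot_backup_commands("vg0", "home", size,
                                        "/mnt/homesnap", "/backup/home"):
        print(" ".join(cmd))
```

With a 200gb /home and the 10% figure, the sketch asks for a 20gb snapshot; a busier filesystem or a slower backup would need more.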

Quote:

then untar previous backup, update with rsync, and re-tar.
Never overwrite or alter your last known good backup. Create a separate new backup. Only after you have created a new one, and verified its integrity, should you consider messing with the previous one. It's good practice to keep a few of the older backups around even though you have a newer one available. How many depends on your available disk space or offline storage capabilities.
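
One possible way to do the "verify its integrity" step is to compare checksums of every file in the source tree against the new backup before touching the old one. The sketch below is only an illustration under that assumption: the function names are hypothetical, and it checks file contents only, not ownership, permissions or timestamps.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                problems.append(rel)
    return sorted(problems)
```

An empty list means every source file has an identical copy in the backup; only then would it be reasonable to retire the previous backup.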

If you have the disk space to store a tar file, and also to store it in its un-tarred state for the rsync update, why tar it up in the first place? Investigate using rsync with the "--link-dest" option instead (it creates multiple apparent full backups, with unchanged files hard-linked to the previous backup so they take no extra space). Depending on how fast your data changes and how much backup disk space you have, you could run your rsyncs quite frequently. Once a week sounds pretty weak for "fairly critical" data, unless that data is mostly static.
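
To make the --link-dest idea concrete, here is a small Python sketch that builds the rsync command for one dated backup directory. The /backup root, the date-named directories and the function name are assumptions for illustration; the command is only constructed, not run.

```python
def link_dest_rsync_command(source, backup_root, new_name, previous_name=None):
    """Build an rsync command that writes a new backup directory.

    Files unchanged since previous_name are hard-linked to it rather than
    copied, so every dated directory looks like a full backup.
    """
    root = backup_root.rstrip("/")
    cmd = ["rsync", "-a"]
    if previous_name is not None:
        # Absolute path, since relative --link-dest paths are taken
        # relative to the destination directory.
        cmd.append(f"--link-dest={root}/{previous_name}")
    # Trailing slashes: copy the contents of source into the new directory.
    cmd += [source.rstrip("/") + "/", f"{root}/{new_name}/"]
    return cmd


if __name__ == "__main__":
    print(" ".join(link_dest_rsync_command(
        "/home", "/backup", "2006-05-10", previous_name="2006-05-09")))
```

Because unchanged files are hard links, deleting an old dated directory later does not affect the newer ones.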

chrism01 05-10-2006 12:29 AM

Without knowing more, I'd say haertig is on the right lines.

For backups, use straight rsync, but ensure you've got the previous backup copied offline somewhere, e.g. tape/CD, before it runs.
The classic rule of thumb was 3 versions, aka son/father/grandfather.

Another (business use) is:
1. daily rsync + daily offline for 1 week
2. keep 1 weekly backup eg Sat night
3. keep 1 monthly backup (last day of mth)
4. Repeat (3.) for quarterly/half-year if reqd
5. Keep last day of year backup (this may be the calendar year or the financial year (UK = Apr 5) or both)
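
The rotation above could be sketched roughly as follows. This is one interpretation, not a standard: it keeps every backup from the last 7 days, plus the newest Saturday, month-end and calendar-year-end backup older than that window. The function name is hypothetical, and the quarterly/half-year and financial-year tiers are left out.

```python
from datetime import date, timedelta


def backups_to_keep(backup_days, today, daily_days=7):
    """Return the sorted backup dates that this rotation would retain."""
    past = sorted(d for d in backup_days if d <= today)
    cutoff = today - timedelta(days=daily_days)
    # Tier 1: every daily backup newer than the cutoff.
    keep = {d for d in past if d > cutoff}
    older = [d for d in past if d <= cutoff]
    tiers = [
        lambda d: d.weekday() == 5,                  # weekly: Saturday
        lambda d: (d + timedelta(days=1)).day == 1,  # monthly: last day of month
        lambda d: (d.month, d.day) == (12, 31),      # yearly: calendar year end
    ]
    for matches_tier in tiers:
        matches = [d for d in older if matches_tier(d)]
        if matches:
            keep.add(matches[-1])  # newest backup in this tier
    return sorted(keep)


if __name__ == "__main__":
    days = [date(2005, 12, 1) + timedelta(days=i) for i in range(161)]
    for d in backups_to_keep(days, today=date(2006, 5, 10)):
        print(d.isoformat())
```

Anything not returned by the function is a candidate for deletion or for moving offline.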

your choice.
As mentioned, mirrors protect against HW failure only; a deleted file is gone.... (ditto infection by worm/rootkit/virus etc)

sixy 05-10-2006 06:52 AM

Yeah, I agree on the don't-delete-old-backups issue, totally. However, in this particular instance the hardware budget is pretty low, so if the client decides to fill all 200 gigs of allocated storage with videos and photos, there is no easy way to store more than one backup on the 250gb spare drive, especially as mpegs/mp3s/jpegs don't compress terribly well. The clients do get a very cheap subscription option for us to store off-site backups, but I am not expecting much enthusiasm for this option, and due to the obvious cost for us we can hardly be expected to do it for free.
The only reason the data is important is that customers tend to get cross if you have to explain that they have forever lost their wedding photos due to hard disk failure.

Thanks for the information on snapshots. I had only heard of them after stumbling across a little article somewhere on using xfs/lvm to do this, and like you said, I had got the wrong end of the stick.
The only reason I am bothering with snapshots at all is that occasionally clients will be running transcoding/ripping jobs overnight, and I don't want to interrupt these if I can help it, but it looks as if that's not going to be a problem.

Oh, by the way, is there any way of squashing 200 gigs of mpegs/mp3s/jpegs into ~120 gigs? (I think I know the answer before I ask... :( )
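
For what it's worth, the point about already-compressed media can be seen with a quick Python experiment. Random bytes are used here as a stand-in for mpeg/mp3/jpeg data, which is an assumption of the sketch rather than a measurement on real files, and compression_ratio is just a helper name for the example.

```python
import os
import zlib


def compression_ratio(data, level=9):
    """Compressed size divided by original size (1.0 means no gain)."""
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    media_like = os.urandom(1 << 20)             # stand-in for compressed media
    text_like = b"wedding photos backup " * 50000  # highly repetitive data
    print("media-like ratio:", round(compression_ratio(media_like), 3))
    print("text-like ratio: ", round(compression_ratio(text_like), 3))
```

Fitting 200 gigs into ~120 would need a ratio of about 0.6, and data that is already compressed typically lands very close to 1.0.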

haertig 05-10-2006 09:43 AM

Quote:

Originally Posted by sixy
The only reason the data is important is customers tend to get cross if you have to explain that they have forever lost their wedding photos due to hard disk failure

I think the client might get cross if some backup plan that they were depending on (and paid you to implement?) failed them. They would of course blame YOU!

I don't officially work with clients directly like you do, but if I did and had one who said "Do it cheap, I don't care if it's half-assed, just do it." then I'd respond with "Find someone else for the job." I can foresee a million and one nightmare scenarios with this particular client you are describing. Advise them if you must, but don't implement it yourself.

haertig 05-10-2006 09:45 AM

Quote:

Originally Posted by sixy
oh, by the way is there any way of squashing 200 gigs of mpegs/mp3s/jpegs into ~120 gigs? (i think i know the answer before i ask... :( )

If you delete the porn mpegs/jpegs, the rest will probably all fit into 37Kb.

sixy 05-15-2006 12:46 PM

Quote:

Originally Posted by haertig
If you delete the porn mpegs/jpegs, the rest will probably all fit into 37Kb.

LOL! :D
'Many a true word is spoken in jest'


All times are GMT -5.