Old 10-27-2009, 12:43 PM   #1
LQ Newbie
Registered: Oct 2009
Posts: 10

Rep: Reputation: 0
Need Backup Solution Advice

Hello Board,

I'm a long-time lurker here and have been referred to LQ by Google for stuff more times than I can count. I signed up today to ask for advice on creating a simple backup solution for a server that contains a lot of data.

I've helped a friend put together a very nice system, Dell R710 w/ dual quad-core processors and 32GB of RAM. It's running CentOS 5.3 and is used to power a large government web site.

We need to back up approximately 1.5TB of data and prefer a daily incremental task and a weekly full backup task. I have two 2TB external drives that are attached to the server for this purpose.

I wanted to get some opinions from others as I've never had to back up such a large amount of data. I was thinking about using a simple rsync script for the daily incremental and just a full copy of /etc, /home, /var and /opt for the weekly full. Those are the only directories/partitions we're concerned about.

Any assistance in determining a solution is greatly appreciated.

Old 10-27-2009, 01:22 PM   #2
Registered: Sep 2005
Location: London, UK
Distribution: Debian
Posts: 258

Rep: Reputation: 38
What to back up depends on what data changes the most, and what's most valuable to you. Personally, I'd keep /etc under version control and use that to restore if I need a backup. I imagine the web site is somewhere under /var? Bear in mind there's probably a chunk of data in there, too, that you don't want taking up space on your backups.

I would suggest, though, that you don't keep your backups in the same room as the server. Theft or fire is likely to affect both of them. Keeping the backups online (or having them able to be brought online autonomously) also gives anyone who does break in the ability to hose them.
I'd advocate an off-site backup system, ideally one with some human interaction.
Old 10-27-2009, 02:47 PM   #3
Registered: Nov 2008
Location: Lower Saxony, Germany
Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX
Posts: 731

Rep: Reputation: 137

Intelligent backup requires handling various restore scenarios. Rsync is one way to back up, but what does your restore look like?

There are dozens of questions you should ask yourself about your backup/recovery scenario.

- How large is your data change per day?
- Is a full backup really required every week, or is one per month enough?
- How long should the backup be available for restore? Is point-in-time recovery required?
- Is disk backup really what you want, or is a more reliable medium required?
- Are there any legal issues to be resolved?
- How often can you test the restore?
- What about fire or floodwater?
- What about handling filesystem links?
- What about files that are deleted and then re-created with the same filename?
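
For the first question (data change per day), a rough number can be had by summing the sizes of files modified in the last 24 hours. The sketch below is only an illustration: the function name `changed_bytes` is made up for this example, it relies on file mtimes, and it cannot see files that were deleted, so treat the result as an estimate.

```python
import os
import stat
import time


def changed_bytes(root, window_seconds=24 * 60 * 60, now=None):
    """Return (changed, total) sizes in bytes for regular files under root.

    A file counts as changed if its mtime falls inside the last
    window_seconds. Symlinks are not followed and not counted.
    """
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    changed = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: look at the link, not its target
            except OSError:
                continue  # file vanished or is unreadable, skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, device nodes
            total += st.st_size
            if st.st_mtime >= cutoff:
                changed += st.st_size
    return changed, total
```

Calling `changed_bytes("/var")` a few days in a row gives a feel for how big a daily incremental is likely to be compared with a full.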

First you should think about your strategy; this will be more difficult than finding a software solution. Plan your backup carefully. After that, look at what is on the market and see which solution fits your strategy.
Old 10-27-2009, 08:06 PM   #4
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.7, Centos 5.10
Posts: 17,051

Rep: Reputation: 2261
I agree, strategy first (as above posts) then tactics ie tech solns.
rsync sounds good and if you give it a clean/empty target once a week, you'll get the full backup.
Consider several generations:

daily - incremental
weekly - full
mthly - save last weekly backup per mth
yrly - save last mthly backup per yr

Also consider whether those definitions of mth, yr will work or whether you want calendar mths/yrs.
Also, consider your country's end-of-financial-yr date; you may want/need a full backup as of the last day of the financial/tax yr.
Ask the Gov dept involved what their internal rules are; you need to fit in with them. They may even be governed by legislation.
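
To make the generations idea concrete, here is a small sketch that labels each backup date with the tier it would be kept under. It assumes calendar months and years and a weekly full on Sunday; `classify` and `full_weekday` are names invented for this example, not part of any tool.

```python
from datetime import date, timedelta


def classify(backup_dates, full_weekday=6):
    """Map each backup date to 'daily', 'weekly', 'monthly' or 'yearly'.

    full_weekday uses date.weekday() numbering (Monday=0, Sunday=6).
    monthly = last weekly full in a calendar month,
    yearly  = last monthly in a calendar year.
    """
    fulls = sorted(d for d in backup_dates if d.weekday() == full_weekday)

    last_in_month = {}
    for d in fulls:
        last_in_month[(d.year, d.month)] = d  # later dates overwrite earlier

    last_in_year = {}
    for d in sorted(last_in_month.values()):
        last_in_year[d.year] = d

    monthly = set(last_in_month.values())
    yearly = set(last_in_year.values())

    tiers = {}
    for d in backup_dates:
        if d in yearly:
            tiers[d] = "yearly"
        elif d in monthly:
            tiers[d] = "monthly"
        elif d.weekday() == full_weekday:
            tiers[d] = "weekly"
        else:
            tiers[d] = "daily"
    return tiers
```

Note that the newest full in the list is always "last so far" for its month and year, so its label is provisional until later backups exist. A financial-year cutoff would need a different grouping key than `d.year`.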

You may want to consider duplicate copies of each backup in case one goes bad... stored eg one on-site for speed, one offsite for DR.
Check whether off-site should be encrypted, prob a good idea.
There's a famous quote about un-encrypted backups making it too easy for people to copy the data; can't find it right now though.
Old 10-27-2009, 08:08 PM   #5
Senior Member
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,190

Rep: Reputation: 105
rsync is one good approach, because it is efficient. However, the end result is a current copy of what's on your original drive. If you want to recover a configuration file you had yesterday, before you broke your system, then you may be out of luck. One solution to that is snapshots. And there just happens to be a solution using rsync to create snapshot-like backups.
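
One common way to get snapshot-like backups out of rsync is its `--link-dest` option, which hard-links unchanged files against a previous snapshot so each day's directory looks like a full copy but only changed files take new space. The sketch below only builds the command line; the helper name `snapshot_command`, the dated directory names and the `/mnt/backup1` mount point are assumptions for illustration.

```python
def snapshot_command(source, backup_root, name, previous=None):
    """Build an rsync argv that writes a new snapshot directory.

    source      -- directory to back up, e.g. "/home"
    backup_root -- absolute path on the backup drive (hypothetical)
    name        -- name of the new snapshot directory, e.g. a date
    previous    -- name of the prior snapshot to hard-link against
    """
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot.
        cmd.append(f"--link-dest={backup_root}/{previous}")
    # Trailing slash on the source copies its contents, not the directory.
    cmd.append(source.rstrip("/") + "/")
    cmd.append(f"{backup_root}/{name}")
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. The very first run has no `previous` and is simply a full copy.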

The disadvantage of that approach is that a failure of the backup drive loses all of that. So, make it raid. Or make it mirror. Or make multiple copies of it in some way. If you have two drives, you could rsync one, take it off site, and rsync the other the next day. Then let it run with rsync snapshots for a week. Then swap the drives and let the rsync snapshot procedure create an updated snapshot on the first drive. Then let it continue running rsync snapshots for a week. Then swap again. At some point, you might run out of space. Then you could start pruning older snapshots.
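
For the pruning step, a minimal sketch, assuming the snapshot directories are named by ISO date (so they sort oldest first) and that you simply keep the newest N. The name `snapshots_to_prune` is made up for this example.

```python
def snapshots_to_prune(names, keep):
    """Return the snapshot names to delete, oldest first.

    names -- snapshot directory names such as "2009-10-27"
    keep  -- how many of the newest snapshots to retain
    """
    if keep <= 0:
        raise ValueError("keep must be at least 1")
    ordered = sorted(names)  # ISO dates sort chronologically as strings
    if len(ordered) <= keep:
        return []
    return ordered[:-keep]
```

With hard-linked snapshots, removing an old one only frees the files that no newer snapshot still links to, so the space reclaimed can be less than the snapshot's apparent size.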

Another alternative is to go with a tape library. (You said this is a large government site, right? So budget shouldn't be a complete roadblock.) I found a relatively inexpensive (as tape libraries go) one -- the Sony LIB162 AIT5. It has 16 tape slots. Each tape holds 400G native, and might compress to more than double that depending on your data. Because it is a carousel changer, it is a simpler mechanism than most, and runs about $5K. If you start looking at LTO4 robots with typically 24 slots or more, the prices are typically $10K or more. But that's all just ball park. You then also have to budget for tapes. The advantage is that you can then have a cycle with nightly backups, tapes going back, say, 6 weeks or more, and off site archival tapes. I use Amanda to manage all that. Amanda has a planner that works out dump strategies to smooth the backup over the entire dump cycle (say, a week), so that you don't have the huge resource hog of once a week full backups of everything, and then the backup system on semi idle the rest of the week just doing incrementals. That's one of the main reasons I chose Amanda.
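
As a back-of-the-envelope check on the tape numbers, here is the arithmetic for the original poster's 1.5TB against 400G native tapes in 16 slots. It assumes decimal units (1.5TB = 1500G) and no compression, and the function names are made up for this example.

```python
import math


def tapes_per_full(data_gb, tape_gb):
    """Number of tapes one full backup needs at native capacity."""
    return math.ceil(data_gb / tape_gb)


def fulls_per_library(slots, data_gb, tape_gb):
    """How many complete full backups fit in the library at once."""
    return slots // tapes_per_full(data_gb, tape_gb)


data_gb = 1500   # about 1.5TB, from the first post
tape_gb = 400    # AIT5 native capacity quoted above
slots = 16       # slots in the library quoted above

print(tapes_per_full(data_gb, tape_gb))            # 4
print(fulls_per_library(slots, data_gb, tape_gb))  # 4
print(slots * tape_gb)                             # 6400 (G native in total)
```

So one loaded library holds roughly four uncompressed fulls, before counting any tapes used for incrementals.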

I actually like as much redundancy as I can manage. I have an external raid array that is managed by ZFS. It uses raidz2 (that's roughly equivalent to raid6) with a hot spare, so it has 9 data drives, 2 parity drives, and 1 hot spare. It would have to experience 4 drive failures to actually lose data. Using ZFS snapshots, I run a snapshot every night, and I keep those for the semester. In addition to that, I run a 6 week tape cycle, periodic archives, and cycle tapes off site. I also have some large radmind directories containing images that allow us to configure large numbers of lab and desktop computers easily and automatically. I use rsync to keep an up to date copy of that directory on another server in another building. I also have a cron daemon that does a remote copy of the Amanda configuration and index directories to a server in another building after the completion of each daily Amanda backup. So, gee, am I covered? hmm, I'm sure I can come up with something else I ought to be doing.

Just spend some time imagining what can go wrong. Then think about how you would recover from that. Then think some more.

If you're interested in digging deeper, check out the O'Reilly Backup and Recovery book, and/or take a look at the companion web site
Old 10-28-2009, 01:21 AM   #6
LQ Newbie
Registered: Oct 2009
Posts: 10

Original Poster
Rep: Reputation: 0
Thanks for all the great replies, you've given me a lot to think about. We already know what needs to be backed up, and when; I'm also aware of how much the data will grow over time. I'm just not certain about which route to take to implement the actual backups. I need speed and efficiency for the daily incremental and the weekly full.

I plan to rotate the external drives offsite like I do with tapes on other machines.


backup, drive, linux, usb
