I'd like advice on a basic backup strategy and the necessary hardware. I'm looking for a solution that I can implement myself and that, preferably, can be configured to run without my intervention.
It would have to be a local solution, as I don't have enough bandwidth to spare for a remote option, in my opinion at least.
My main backup concerns are:
- Websites (along with permissions)
- conf files for important daemons
- a few select homedirs
(open to suggestions here also)
Of course I'd like to back up to external media, for which I'd also like recommendations. I have no problem with doing the necessary footwork; I'm just hoping to draw on any possible experience here.
First off, tape is your best choice. Some may use CDs or network backups, but you just can't beat the speed and capacity of tapes. And depending on which type of drive/tapes you choose, the media isn't too expensive.
There is backup software out there for Linux, but personally I just use a script that cron runs every day. The script just uses tar to back up everything I want to the tape streamer (/dev/st0) and then saves a log file in my home directory. I also have a cron job that erases and retensions the tape every month or so. This is just something I like to do; it isn't really required (at least I don't think so).
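For anyone who wants a starting point, the tar-plus-log idea above might look roughly like the sketch below. It is written in Python with the standard tarfile module rather than as the poster's actual script, and the names `run_backup`, `sources`, `target` and `log_file` are made up for illustration. The target is assumed to be a plain file path here; pointing it at a tape device such as /dev/st0 is an untested assumption.

```python
import tarfile
import time
from pathlib import Path


def run_backup(sources, target, log_file):
    """Archive each source path into `target` and append a summary line to `log_file`."""
    archived = []
    # Mode "w" writes an uncompressed tar archive, similar to `tar cf target ...`.
    with tarfile.open(target, "w") as tar:
        for src in sources:
            src = Path(src)
            if not src.exists():
                continue  # skip missing paths instead of aborting the whole run
            # tar headers record mode, owner and mtime for each member.
            tar.add(str(src), arcname=src.name)
            archived.append(str(src))
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_file, "a") as log:
        log.write(f"{stamp} archived {len(archived)} path(s) to {target}\n")
    return archived
```

A cron entry would then call a small wrapper around `run_backup` once a day with the directories that matter (websites, conf files, selected homedirs).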
I do daily backups of my directories that change a lot, and then a monthly for the entire system. Of course your schedule is going to depend on what your server is doing.
As for what tape/drive to use, that is really dependent on how much you want to spend, and how much you need to store.
I would respectfully disagree that tape is the way to go. Storage is so cheap these days that buying a second HD simply for the purpose of backing up the first one is cost-effective. I agree with MS3FGX's comments in general, but would apply them to a backup HD as opposed to a tape. As always, just my 2 cents - J.W.
I would also agree with the second hard drive option. I use a spare hard drive and back up my /home on an hourly basis with rsync. The spare drive is only mounted during the backup procedure, so it is fairly well protected against routine hardware problems should they occur.
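The mount, rsync, unmount cycle described above could be sketched like this. It is only an illustration: the function names and the `/mnt/backup` style mount point are hypothetical, and `mount <dir>` with a single argument assumes the spare drive has an /etc/fstab entry. The rsync options used (`-a` for archive mode, `--delete` to mirror deletions) are standard ones.

```python
import subprocess


def backup_steps(source, mount_point, subdir="home"):
    """Return the mount, rsync and umount commands for one backup run."""
    # Trailing slash on the source: copy the directory's contents, not the directory itself.
    src = source.rstrip("/") + "/"
    dest = mount_point.rstrip("/") + "/" + subdir + "/"
    return [
        ["mount", mount_point],                  # assumes an /etc/fstab entry for mount_point
        ["rsync", "-a", "--delete", src, dest],  # mirror source onto the spare drive
        ["umount", mount_point],
    ]


def run_rsync_backup(source, mount_point, runner=subprocess.run):
    """Mount the spare drive, sync, and always unmount afterwards."""
    mount_cmd, rsync_cmd, umount_cmd = backup_steps(source, mount_point)
    runner(mount_cmd, check=True)
    try:
        runner(rsync_cmd, check=True)
    finally:
        # Unmount even if rsync fails, so the drive never stays mounted.
        runner(umount_cmd, check=True)
```

The `runner` parameter exists only so the sequence can be dry-run or tested without touching real disks; an hourly cron job would call `run_rsync_backup("/home", "/mnt/backup")` as root.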
IMHO it depends on the situation: how much data you need to back up and how important that data is. If you rely on a second hard drive which is always installed, you risk a hardware failure that damages the drive's electronics. You could retrieve the data through a data recovery service, but that isn't cheap. Keeping the drive unmounted will minimize file corruption if the OS crashes. Using multiple spare drives with a removable chassis would provide some security in case of a catastrophic hardware problem.
If you're running a business, a tape drive makes good sense. I personally use a tape drive, but that's just because I have the capability.
With whatever media you end up using, make sure you stick to the backup plan and verify on a regular basis that you can actually restore.
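One way to make the "verify that you can actually restore" step routine is to restore the archive into a scratch directory and compare checksums against the live files. The sketch below assumes a tar archive whose top-level member is named after the source directory; `verify_restore` and `sha256_of` are illustrative names, not part of any tool mentioned in this thread.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(archive, source_dir):
    """Restore `archive` into a scratch directory; return files that are missing or differ."""
    source_dir = Path(source_dir)
    problems = []
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)
        # Assumption: the archive stores the tree under the source directory's own name.
        restored_root = Path(scratch) / source_dir.name
        for original in source_dir.rglob("*"):
            if not original.is_file():
                continue
            rel = original.relative_to(source_dir)
            restored = restored_root / rel
            if not restored.is_file() or sha256_of(restored) != sha256_of(original):
                problems.append(str(rel))
    return sorted(problems)
```

An empty list means every current file was restored byte-for-byte; files changed since the backup will show up as differences, which is expected on a busy system.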
Here are the levels of redundancy and backup for our systems that run MySQL, J-Tomcat 5.0.19, Samba, SSH, SSL, in case of loss of data due to physical & software failure (note: this might be overkill for some... )
1. software RAID 1
2. CRON jobs (tar, mysqldump -- weekly)
3. Heartbeat (Linux-High Availability) for redundant services / IP faking / failover
4. Rsync every minute between local and remote locations over SSL
5. battery backups
6. scripts for burning CDR / DVD-Rs via one-click by end user for all app info (db, custom files)
It's almost fully automatic except for the manual backups to CDR / DVD-R.
First off, thanks for all the lovely input. I'd like to respond one message at a time.
MS3FGX: I like that idea, esp. as it involves media which can be 'separated' from the Linux box. However, these drives seem expensive. I've been looking at a reconditioned Certance TapeStor Travan 20 ATAPI tape backup, but I have not been able to find out if it's Linux compatible.
J.W.: Your approach seems more on the affordable side, at least in my ignorance. But how would you rate the feasibility of this with an external drive? I really would rather have this outside the box.
michaelk: I suppose it's not a 'mission critical' system... but I figure that this is critical experience I should gather before I have some actual business's very important data to handle. For now I'd say that the total backup size should easily be under a gig. But things change, and I'd like said system to last a while.
jsokko: Love the approach. Wish I had all those resources. Where can I take a look at such scripts for burning CDR/CDRW? On the Windows side I know there was a UDF formatting for CDRWs which allowed you to treat them as regular disks: does such functionality exist in Linux yet? Also, any suggested reading on the topic of Heartbeat? I've never heard of it before.
I personally don't use a second HDD for a few reasons:
Hard disk drives degrade and fail; if they didn't, you wouldn't be making the backups in the first place. Tapes, on the other hand, are a hardy medium. You can drop them, get them wet, pile them all in a box, and still be fairly sure they are going to be good. Longevity is also good, though they should be replaced every few years; optical media is certainly the most reliable when it comes to long-term storage.
A second HDD must remain in the machine. While a HDD rack helps this situation (I use them), unless it is a hot-swap rack you need to power down the server just to remove your backup. In a mission critical situation (the only DNS server for a busy network, for instance) this simply cannot happen every day.
3. Risk of damage
It is possible that some sort of hardware failure or physical damage is what damaged your primary drive (power surge, overheat, impact, etc). In the case of a HDD backup system, the damage that the first drive received could very well happen to your second drive, resulting in a complete loss of data.
A HDD is much larger than any other backup medium. If you are using a different medium for every day (and you should), you are going to be stuck with a pile of HDDs. Certainly not space efficient. Some may respond that you could partition a single drive into 7 partitions (one per day) and use only one drive, but not only would that cut your usable storage capacity to 1/7 of its original size, there is also the risk of a hardware failure on the drive resulting in the loss of a week's worth of data instead of a day's.
While the drives may be pretty expensive, the tapes are not. Conversely, there is no "drive" required for a HDD backup system (except for racks if you use them) but the HDD itself is expensive. For instance, you can get a 260 GB tape for $60.00, while a 250 GB HDD is $230. When you are buying 7 at a time, that is a massive difference in price.
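To put numbers on the "7 at a time" comparison, here is the arithmetic as a tiny Python sketch using only the prices quoted above ($60 per 260 GB tape, $230 per 250 GB HDD). The tape drive's own price is deliberately left as a parameter, since no price for a drive that takes those tapes is given; `rotation_cost` is just an illustrative name.

```python
def rotation_cost(unit_price, count=7, drive_price=0.0):
    """Total cost of one media set (one unit per day) plus an optional drive."""
    return drive_price + unit_price * count


tape_set = rotation_cost(60.0)    # 7 tapes  -> 420.0
hdd_set = rotation_cost(230.0)    # 7 HDDs   -> 1610.0

# The tape drive can cost up to this much before the HDD set becomes cheaper.
break_even_drive_price = hdd_set - tape_set   # -> 1190.0

print(tape_set, hdd_set, break_even_drive_price)
```

So on media alone the weekly tape set is $1,190 cheaper, and the tape option only loses if the drive itself costs more than that.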
Now, having said that, a USB external drive could be a good low-cost backup device for a personal server. Not something I would run, but that is a personal choice of mine. I once had a corporate server get destroyed on me, and there was nothing I could do since the administration didn't want to spend the money on a reliable backup system when it was first installed, so I had to completely rebuild the machine. After that, I refuse to work on servers that don't have some sort of removable backup media.
As for the drive, I can personally vouch for the Travan drive, as it is what I use, though mine is made by HP.
TigerDirect has that very same drive you are looking at (well, brand and name anyway, I don't know what capacity you are looking at) and it does say it is Linux compatible, and they have it for $380.00. That drive and a few tapes is going to be cheaper than enough HDDs to have a complete backup schedule (unless you are going for lower capacity).
In the end though, it is really up to you to look into the different options and decide the best option for your situation.
After reading your posts a few more times, I am going to suggest a smaller capacity (and cheaper) drive.
I would look for the Sony DDS-2 drive. With a maximum compressed capacity of 8GB (4GB uncompressed) and tapes that are only $5.00, it is a great choice for low capacity storage.
However, this drive is outdated by a few versions, and as such may be harder to find. I have seen them online for around $400, but that seems a bit expensive. Your best shot is to find either a reconditioned DDS-2 drive or check out eBay, where I have seen them go for around $100 - $150.
MS3FGX: My humble apologies. TigerDirect is where I saw the drive. (I secretly hoped that I was one of the few that knew about the low prices at TigerDirect) The thing is I had looked more at the specs presented in point form and not at the briefing on it so I missed the part about Linux.
Thanks. I personally tend to associate tape with backup by nature, and provided I can get a good price I'll be sure to go with a tape drive.
Now, any points of interest, suggestions or reading material on the actual backing up? As in, whether to just tar or whatever the case may be. Or should I just wait till I have a drive in hand before I start worrying about that?
Well, is this for a production business system or just for your own personal use, and how much data are we talking about? For a basic home system, I would say that the easiest way to do a backup is to install a second hard drive inside your box, and simply copy over the data from the main drive to the backup drive periodically. Similarly, you could tar up your data then burn it to a CD if it's small enough. If you need to restore something in either of those scenarios, it would take all of about a minute or two.
On the other hand, if it's a company's website + database and you are dealing with massive amounts of data, then yes, tape is a good way to go. Note however that tape is not foolproof, and I would not consider tape to be an end-all, be-all solution. Sure, drives have their share of problems and can go bad, but tapes likewise come with their own set of problems and weaknesses. Because the act of reading and writing a tape involves physical contact, tapes can eventually wear out; they can also be stretched or torn, and if a magnetic field happens to come into close proximity to the tape, you can lose data. If the tapes are kept off-site, then restoring a file will be delayed until the tapes have been physically retrieved.
To sum up, I'd suggest just throwing an extra HD into your box for personal backups. On a corporate system, tapes make much more sense. However, if it's for a corporate application then that expands the scope of the original question significantly, and would need to include the storage of the backups off-site in a secure location, building fail-over systems, contracting with a disaster recovery center, doing "fire-drills" of restoring your production system on an empty machine, etc. Good luck with it. -- J.W.
So, what's wrong with the Iomega Jaz 2 gig? Even tho it's no longer in their product line, you can still get an external drive and a couple of disks for less than $100 USD, if you shop around. It's portable, quick and easy to use, holds bunches of whatever you want it to hold. It's Linux compatible; there are Linux drivers and HOWTOs; and the new Rev drive ($400 or thereabouts; disks $70 or so) goes up to 35G of storage.
At the risk of repeating myself, I'm only suggesting that for personal systems, installing a second drive for backup purposes is worth considering. If one has strong opinions for or against that idea, so be it. -- J.W.
Believe it or not, my setup does not cost that much. It's just a complete headache to set up. I came up with this solution to make the hardware assets a non-factor. We use two VIA mini-itx boxes (the cubes) at the remote install locations that cost approximately 500 dollars a pop. They are configured exactly alike so if one in the 2-node cluster fails, we just ship them a new one and all they need to do is plug in the null-modem and 2 network cables to re-establish cluster status (and automatically copies the app & data as well to the newly attached node).
As for documentation on Heartbeat, it's sorely lacking. The Linux-HA group has a few outdated docs online at www.linux-ha.org . I'm thinking of writing a simple HOW-TO from my experiences to help people out.
If your project is for home & for learning, you might also want to look into DRBD, which is network RAID. It was re-redundant for our purposes (is that a pun?) but it's another option.
Also something else to remember : MySQL creates databases in their separate dirs in the data folder. So, instead of fiddling around with master-slave relationship over SSL (which was causing me issues -- MySQL REQUIRES you to compile the db server for SSL connections), I am Rsync-ing the directory instead to remote storage & to the failover. When the first node fails, the second node takes over the IP, hostname, and starts all services configured for the failover (and sends me an email about it)... and voila. there's no breakage in service to the end user (okay, ten seconds max).
As for the scripts, they're just cdrecord commands for the latest backup tar / zip file that CRON created. not a biggie.
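To give a rough idea of what such a one-click script involves, here is a Python sketch that picks the newest archive in a backup directory and builds a cdrecord command line for it. This is not the poster's script: the function names, the file patterns and the `/dev/cdrw` device are assumptions for illustration, and the command is only constructed, not executed. `-v`, `dev=` and `-data` are ordinary cdrecord options; check `cdrecord -scanbus` and the man page for the right device on your system.

```python
from pathlib import Path


def latest_backup(backup_dir, patterns=("*.tar", "*.tar.gz", "*.zip")):
    """Return the most recently modified archive in backup_dir, or None if there is none."""
    candidates = [p for pat in patterns for p in Path(backup_dir).glob(pat)]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def burn_command(image, device="/dev/cdrw"):
    """Build (but do not run) a cdrecord command that writes `image` as a data track."""
    # device is a placeholder; the real value depends on the burner.
    return ["cdrecord", "-v", f"dev={device}", "-data", str(image)]
```

Passing the resulting list to `subprocess.run(..., check=True)` would perform the actual burn.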
We looked hard into tape backups... and came to the conclusion that it was not necessary since we're giving them the manual option to backup all data on DVDRs which are cheaper than tapes / tape drives. I mean, we're giving them 4 levels of data redundancy / backup capabilities including automated diff-based remote storage over TCP/IP regardless of how large their local db becomes... I think that's more than enough.
Fault tolerance and backups are two different things. High availability clusters and RAID ensure maximum uptime but they won't help you with other types of problems. Say a user deletes a bunch of important records from a database and nobody notices for a couple days. With just fault tolerance that mistake would be propagated and you'd have no way to recover the missing data. Backups give you the means to restore the data to a point-in-time prior to the mistake.
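The point-in-time idea can be made concrete with a few lines of Python: given the timestamps of the backups you have kept, pick the newest one taken strictly before the mistake happened. `backup_before` is an illustrative helper, not part of any backup tool.

```python
from bisect import bisect_left
from datetime import datetime


def backup_before(backup_times, incident):
    """Return the latest backup time strictly earlier than `incident`, or None."""
    times = sorted(backup_times)
    # bisect_left gives the index of the first backup at or after the incident.
    i = bisect_left(times, incident)
    return times[i - 1] if i > 0 else None


backups = [datetime(2004, 6, d) for d in (1, 2, 3, 4, 5)]   # one backup per day
mistake = datetime(2004, 6, 3, 14, 30)                      # records deleted mid-afternoon
print(backup_before(backups, mistake))                      # 2004-06-03 00:00:00
```

With only a mirror or cluster there is effectively a single "backup" that is always newer than the mistake, which is why the function would have nothing useful to return in that case.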