LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Server (http://www.linuxquestions.org/questions/linux-server-73/)
-   -   MySQL backup - how to deal with large amounts of data? (http://www.linuxquestions.org/questions/linux-server-73/mysql-backup-how-to-deal-with-large-amounts-of-data-862785/)

karll 02-15-2011 04:57 AM

MySQL backup - how to deal with large amounts of data?
 
Hi,

Hopefully someone can give me some good advice on this:

I help run a couple of Internet forums and recently we've been trying to become a bit more serious about backup. It seems the better way to do MySQL backup is to use the binlog. However, that binlog is huge! We seem to produce something like 10Gb per month.

I'd like to copy the backup to somewhere off the server as I don't feel like there is much to be gained by just copying it to somewhere else on the server. I recently made a full backup which after compression amounted to 2.5Gb and took me 6.5 hours to copy to my own computer ... So that solution doesn't seem practical for the binlog backup.

Any ideas? Should we rent another server somewhere? Is it possible to find a server like that really cheap? Or is there some other solution? What are other people's MySQL backup practices?

choogendyk 02-15-2011 07:38 AM

Yup. binlog can get huge. Typically, though, you think of that as an incremental, and you want to run full backups using mysqldump with incrementals in between. See http://dev.mysql.com/doc/refman/5.1/...p-methods.html. You can script that up yourself, with the appropriate table locking, copying the binlog somewhere, and running mysqldump with --flush-logs, but you could also use ZRM for MySQL and manage everything through that. See http://www.zmanda.com/backup-mysql.html.

karll 02-15-2011 08:30 AM

Thanks choogendyk - yes, I have looked into ZRM briefly and might be using that once we have decided on the bigger picture. And yes, running full backups occasionally in between incremental backups makes sense. Because of the size of the databases, I'd prefer not to use mysqldump (which I imagine would create a humonguos ascii file). If mysql is shut down, I should be able to just copy the /var/lib/mysql/ directory, yes? (mysqlhotcopy is out of the question since we have all sorts of table types (InnoDB etc.))

There were some interesting user comments to the mysql article. Does anyone here use Amazon S3 for storage of backups? Or some similar solution?

Also, what is that snapshot filesystem they talk about in one of the user comments? If I understand correctly, it's a way to copy the database really fast? Would that be difficult to set up on a running server like we have?

jcalzare 02-15-2011 08:45 AM

Mysqldump is the best and most appropriate way to take legitimate full backups without using third-party software. Since the output is a text file, you can get very excellent compression using gzip. I guess I don't understand how you think a mysql dump will be larger than trying to rsync your entire mysql directory.

If this is a dedicated server hosted somewhere, you may want to ask them to install an additional hard drive to use for backup space. This way, you can copy mysql dumps (or however you want to do it) to the separate hard drive for safe keeping. This is sufficient day-to-day backups. I would then suggest downloading a copy monthly to your PC for disaster recovery purposes in the event that your datacenter blows up or something.

karll 02-15-2011 01:25 PM

Quote:

Originally Posted by jcalzare (Post 4259064)
Mysqldump is the best and most appropriate way to take legitimate full backups without using third-party software. Since the output is a text file, you can get very excellent compression using gzip. I guess I don't understand how you think a mysql dump will be larger than trying to rsync your entire mysql directory.

I'm thinking the mysql directory is basically "binary" whereas the dump file from mysqldump is "ascii" and with other overhead in addition. Also, it presumably will take a long time to create this dump and then compress it afterwards. If I can simply copy one folder, then that would be better.

Quote:

If this is a dedicated server hosted somewhere, you may want to ask them to install an additional hard drive to use for backup space. This way, you can copy mysql dumps (or however you want to do it) to the separate hard drive for safe keeping. This is sufficient day-to-day backups. I would then suggest downloading a copy monthly to your PC for disaster recovery purposes in the event that your datacenter blows up or something.
Thanks, yes that sounds like a reasonable solution. I might look into that.

sneakyimp 02-16-2011 03:15 AM

First off, I don't think there's much sense in making a backup from one file on a hard drive to another location on that hard drive for a couple of reasons:
* Performance is terrible. The hard drive arm is going to be seeking back and forth reading a file here and copying it over there -- very bad performance! With 10GB backups, you jeopardize the stability of your server! It may crash if the hard drive gets too busy.
* This type of backup is no help at all if your hard drive fails. Your hard drive fails and your original data is gone and your back is gone and you are S.O.L. Same-disk backup will only protect you if you happen to get some data corruption that is localized to a particular file.

Making a backup to a separate hard drive is MUCH faster and should dramatically improve your backup performance speed and, more importantly, you'll be protected in the event of a catastrophic hard drive failure. However, 10GB is still pretty large. It might take minutes just to copy the data under optimal conditions. Still might take too long.

Mysqldump is pretty awesome and is always a viable option. Mysqldump is pretty fast and, depending on what options you choose, may be comparable in size to your original binary data and also more portable. Just copying your folders will mean that your backup needs a mysql setup that is more or less identical to your current setup which may not be easy to set up. Another cool thing about mysqldump is that you could run it from an entirely separate machine (perhaps a cheap, wimpy one whose sole job is making db backups). As long as the network between the machines is fast (say GB ethernet) then it's roughly as fast as one internal to the machine.

Another option is a bare-metal backup service like the "Guardian" product offered by liquidweb (other hosting companies have similar products). The idea with Guardian is that it continuously backs up your system's data to an offsite location. The backup is constantly being updated and is essentially up to the minute and is a backup not just of your database but other files as well (images, source, etc.). It's pricey (I think around $100 or more a month) but seems to be pretty awesome.

chrism01 02-17-2011 01:01 AM

You should be able to create a mysqldump fairly quickly to another disk in the same machine & it guarantees a consistent view of the DB.
It won't matter then if it takes a while to compress it on-site, then txfr to a remote site. The orig DB will still be running just fine.
If you're really paranoid, you can 'nice' down the gzip & txfr job.
ASCII tends to compress very effectively.
Also, as above, your replacement/test systems may not be identical to the live system, but you can edit an ASCII file if reqd. Not possible with binlog. It's just a more flexible solution.

karll 02-18-2011 04:36 AM

Thanks everyone for the input - much appreciated. You all make good points. I now realise that it's possibile my scepticism towards mysqldump is unwarrented, so I will do some tests to see how well it performs. The flexibility it offers is definitely a plus.

trickykid 02-18-2011 10:51 AM

If you're using gzip as the compression, you can also look into bzip2 or even rzip, they might take a little longer to compress but you'll get a better compression ratio so the backups are smaller and easier to manage as well.


All times are GMT -5. The time now is 05:03 AM.