Is it possible to achieve HA on a MySQL replicated setup?
Currently my database server running MySQL is on RAID 1, and I have another server querying it to do a daily backup. That's the only protection I have.
I am looking into a more active backup, so I'm exploring replication. But I realise replication doesn't do automatic failover... or so I read.
Is there a way to achieve it? That is, an auto-failover, load-balanced environment to ensure HA?
Can anyone please advise?
Thanks!
I've run database-driven stuff for years. HA is over-priced and over-rated in all but the most extreme cases. A cheap-o 1U PIII from eBay is plenty powerful enough for a surprising number of cases, and will usually deliver 3-4 nines (99.9% to 99.99%) on a shoestring.
Once you get ego out of the way, you'd be surprised how rarely anybody actually needs more than this. At 99.9%, you have roughly 8.8 hours of downtime per year (at 99.99%, under an hour). Be honest - what would happen if your system was down 1 day every year or so?
To go HA and get to 5 nines (99.999%) means about 5 minutes of downtime per YEAR.
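For anyone who wants to check the "nines" arithmetic, here is a minimal sketch. It assumes a 365-day year and treats availability as a simple percentage of wall-clock time; the function name is just for illustration.

```python
# Downtime budget per year for a given availability percentage.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return how many hours per year a system may be down."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for nines in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(nines)
    print(f"{nines}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

That works out to roughly 8.8 hours at three nines, about 53 minutes at four nines, and a little over 5 minutes at five nines.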
And the price is constant vigilance. If hiring a qualified technician full-time in order to shave off that 1 or 2 business days per YEAR is out of the question, it's unlikely that you need to worry about it.
Having any system that "cuts over" automatically in a failure is a tremendous pain in the arse. A techie at the datacenter fat-fingers the switch on a power strip, and the 5 minutes of downtime on your database server morphs into a sleepless night rebuilding your primary database on the server and resetting all your logic servers to use the primary DB server again.
Yuck.
My advice? Write a script that backs up your database every hour and copies it to a remote location with scp or rsync over ssh. If you want, you can have a "hot" backup cheezo PIII that loads the database hourly as well, so that if you have to cut over, you change a setting on your web servers and you're done.
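The hourly dump-and-copy idea could look something like the sketch below. Everything specific here is an assumption for illustration: the backup directory, the remote host and path, and that MySQL credentials come from a `~/.my.cnf` file and ssh keys are already set up. `--single-transaction` only gives a consistent, non-locking snapshot for InnoDB tables, so a busy MyISAM database would need a different approach.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# Hypothetical values for illustration; adjust for your own hosts and paths.
BACKUP_DIR = Path("/var/backups/mysql")
REMOTE_TARGET = "backup@backuphost.example.com:/srv/mysql-backups/"


def dump_command() -> List[str]:
    # Credentials are assumed to come from ~/.my.cnf, not the command line.
    return ["mysqldump", "--all-databases", "--single-transaction"]


def copy_command(dump_file: Path, remote: str = REMOTE_TARGET) -> List[str]:
    # -a keeps file attributes, -z compresses in transit, -e ssh runs over ssh.
    return ["rsync", "-az", "-e", "ssh", str(dump_file), remote]


def dump_path(now: datetime, backup_dir: Path = BACKUP_DIR) -> Path:
    # One timestamped file per run, e.g. mysql-20240102-0300.sql
    return backup_dir / f"mysql-{now:%Y%m%d-%H%M}.sql"


def run_backup(now: Optional[datetime] = None) -> Path:
    """Dump all databases to a local file, then copy it to the remote host."""
    target = dump_path(now or datetime.now())
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as out:
        subprocess.run(dump_command(), stdout=out, check=True)
    subprocess.run(copy_command(target), check=True)
    return target
```

Calling `run_backup()` from an hourly cron job would do the rest; `check=True` makes a failed dump or copy raise an error, so a failed run is not mistaken for a good backup.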
Currently, I am already running a spare machine (on RAID 5 though, hehe) which pulls the database from the live production server every day (5am) using rsync over ssh. So that gives me a backup that is up to 24 hours old (should my RAID 1 fail). I can't do it hourly as our DB is busy most of the time.
I'm currently studying HA and realise I can't do it with just any distro. I need a cluster-capable distro with something like Red Hat Cluster Suite to achieve my HA and LB. I suppose that's what you meant by the 'feasibility' factor.
Thus I am looking at ways to back up my DB on a 'live' basis, which leads me to replication. But while that gives me an 'almost' by-the-minute backup, it doesn't fail over automatically, and hence, as you mention, probably only caters to that 8 hours of downtime per year. We were actually down for more than 8 hours this year because the sheer amount of traffic and data is huge, but well...
So, do you think I should just rely on replication and manually point to the 'slave' machine should the master fail? Initially I thought a 'master-slave' replication setup meant that the slave would kick in if the master goes down. Apparently not.
But as I said, noted your points. Very logical indeed.
Like I've said - replication mostly REDUCES uptime rather than improving it, because there are so many things that can go wrong. I've seen more downtime caused by replication errors, partial switchover failures, and network burps than I've ever seen caused by even catastrophic failure (e.g. a motherboard catching fire).
If you are really sure you want to try HA, my suggestion would be to go ahead and replicate to your backup host, but don't actually use the backup host for anything else. If your primary fails, reconfigure the backup host as the primary, and manually change your logic/web servers to use it.
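A manual failover only helps if the backup host was actually keeping up, so it may be worth checking replication health regularly. The sketch below parses the text you would get from running `mysql -e "SHOW SLAVE STATUS\G"` on the backup host. The sample text and the 60-second lag threshold are made up for illustration, and newer MySQL releases rename the statement and fields (`SHOW REPLICA STATUS`, `Replica_IO_Running`, and so on), so treat the field names as an assumption about an older server.

```python
# Abbreviated, made-up sample of `SHOW SLAVE STATUS\G` output.
SAMPLE_STATUS = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Error:
        Seconds_Behind_Master: 3
"""


def parse_status(text: str) -> dict:
    """Turn 'Key: value' lines into a dict; separator rows are skipped."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # e.g. the "*** 1. row ***" separator
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replica_is_healthy(fields: dict, max_lag_seconds: int = 60) -> bool:
    """Both replication threads must run and lag must be within the limit."""
    if fields.get("Slave_IO_Running") != "Yes":
        return False
    if fields.get("Slave_SQL_Running") != "Yes":
        return False
    lag = fields.get("Seconds_Behind_Master", "NULL")
    # The server reports NULL when lag is unknown (e.g. replication stopped).
    return lag.isdigit() and int(lag) <= max_lag_seconds


print(replica_is_healthy(parse_status(SAMPLE_STATUS)))
```

Run from cron with an email or pager alert on `False`, this gives some warning that the backup host has fallen behind before you actually need it.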
I'd suggest a set of scripts (I'd use SSH with RSA keys so it's automatic) that do this all in one fell swoop, to switch from production to failover and back again. Test them at least monthly, at night or something. Automate the test as well, so that it's easy enough that you might actually do it on a regular basis.
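A rough sketch of what such a switch-over script might look like, under several assumptions: the host names are invented, key-based ssh is already working, and `/usr/local/bin/set-db-host` is a hypothetical helper on each web server that rewrites the application's database setting. It defaults to a dry run that only prints the commands, which is also a cheap way to automate the monthly test.

```python
import shlex
import subprocess
from typing import List

# Hypothetical hosts for illustration only.
BACKUP_DB = "db2.example.com"
WEB_SERVERS = ["web1.example.com", "web2.example.com"]


def ssh(host: str, remote_command: str) -> List[str]:
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what you want in an unattended script using keys.
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def failover_plan(new_db: str, web_servers: List[str]) -> List[List[str]]:
    """Ordered commands: promote the backup DB, then repoint each web server."""
    # Stop applying replicated changes and allow writes on the backup host.
    promote = 'mysql -e "STOP SLAVE; SET GLOBAL read_only = OFF;"'
    plan = [ssh(new_db, promote)]
    for web in web_servers:
        # set-db-host is a hypothetical helper, not a real tool.
        plan.append(ssh(web, "/usr/local/bin/set-db-host " + shlex.quote(new_db)))
    return plan


def execute(plan: List[List[str]], dry_run: bool = True) -> None:
    for command in plan:
        print(" ".join(shlex.quote(part) for part in command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failure


# Dry run: prints what would be executed without touching any server.
execute(failover_plan(BACKUP_DB, WEB_SERVERS))
```

Switching back is not just the same plan with the hosts swapped: the old primary has missed every write made since the failover, so it has to be resynced from the current primary first. That step is left out of this sketch.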
HA is non-trivial, and I've never seen the business case where it was actually warranted. If you can't justify a full-time DBA position to make sure that database is up 100% 24x7, you should probably be looking at a hot backup system with manual failover, propagated every hour or so, with a promise of 1-2 hour turnaround during business hours in the case of a failure.