Database replication for High Availability Standby system

ravishchugh · 04-19-2016, 01:56 AM

Hi all,

I am making a standby High Availibility system with 2 nodes using Pacemaker, Corosync and PCS. The resource is an Apache webserver with SQL database storing dynamic values from a program. This program is running simultaneously on both the PCs and populating their respective databases.

Now, I have to set up a failover mechanism for database. For example, if the main fails and standby takes over, and the main recovers after that, how will the SQL database of main will synchronise with that of standby at bootup.

Regards

tshikose · 04-19-2016, 03:25 AM

Hi,

There are several things to consider that might lead to different but still "working" solution while looking at your problem.

From a HA perspective, you have your apache resource that is handled by pacemaker, so I guess that you also have a filesystem resource for the folder /var/www.
So, you can also have shared filesystem (gluster or clvm) for your database data (normally /var/lib/pgsql if you are using PostgreSQL).

Or you can look at the native mechanism of your database engine.
All recent and decent database engines should have that.

ravishchugh · 04-19-2016, 03:37 AM

Quote:

Originally Posted by tshikose

Hi,

There are several things to consider that might lead to different but still "working" solution while looking at your problem.

From a HA perspective, you have your apache resource that is handled by pacemaker, so I guess that you also have a filesystem resource for the folder /var/www.
So, you can also have shared filesystem (gluster or clvm) for your database data (normally /var/lib/pgsql if you are using PostgreSQL).

Or you can look at the native mechanism of your database engine.
All recent and decent database engines should have that.

Thanx Tshikose, but am really a noob in this matter and didnt quite get what u wanted to imply. Can u please give me some more details or references please.

Regards

tshikose · 04-19-2016, 03:42 AM

Hi,

Basically you have two choices for your database high availability.
You can rely on HA with pacemaker, or on the own database replication capabilities.
What is yours?

Which database engine are you using?

ravishchugh · 04-19-2016, 03:44 AM

Quote:

Originally Posted by tshikose

Hi,

Basically you have two choices for your database high availability.
You can rely on HA with pacemaker, or on the own database replication capabilities.
What is yours?

Which database engine are you using?

I have some experience with SQL DB so will use that sir.

tshikose · 04-19-2016, 04:35 AM

Hi,

While there are plans form Microsoft to port MS SQL Server to Linux, that will not happen before next year.
So, I doubt that your experience with MS SQL Server will be of much help.
At least SQL is SQL everywhere, so there are similarities, but also big differences.

My understanding was that you already have running system for which you wanted to add HA.
Now, it seems as if you are not yet there.
If you can I advise you to use Postgres as database engine.

What is your experience with Linux HA?

ravishchugh · 04-19-2016, 04:45 AM

I have never worked on postegres SQL before but if its advisable...ill shift on that.
I have successfully setuo my cluster on 2 nodes with httpd apache resource and its working fine.
Now i have to include the database part and make the webpage dynamic.

tshikose · 04-19-2016, 05:00 AM

Hi,

That is good.
Try to build a Postgres resource similar to httpd with a shared folder /var/lib/pgsql.
I have never done that (just because I never came in a situation where I needed to build a HA DB), so I am just sharing guidelines.

ravishchugh · 04-19-2016, 05:02 AM

Thanx sir,
Ill try doing that.
Do u have any reference or tutorial for this?

Regards

tshikose · 04-19-2016, 05:11 AM

Official Red Hat Enterprise Linux 7 documentation. Look at Clustering section.

ravishchugh · 04-21-2016, 01:48 AM

Quote:

Originally Posted by tshikose

Official Red Hat Enterprise Linux 7 documentation. Look at Clustering section.

Sir I had a look at the documentation.

I have a small doubt regarding my scenario.

I have the same application running in 2 PCs - one main and one standby - and this application is updating the SQL DB and same is queried by the client via a webpage.
The solution given in the documentation replicates the main DB into standby DB at all times whereas my requirement is of replication only once when the main system recovers from a failover and need to replicate data from standby for the time it was down.

Is there a solution for this kind of scenario sir?

Regards

tshikose · 04-21-2016, 01:57 AM

Hi,

Database replication generally involves huge data.
So, trying to start a replication when a failover occcurs is not pratical. The replication time will just take long enough, keeping the service unavailable, and then defeating the goal of HA.
You need to permanently replicate the data between your two database instances. Like that when a failover occurs, just few not yet replicated data need to be shipped across.

Googling a little for "Linux Postgres replication" gave me this.

ravishchugh · 04-21-2016, 02:04 AM

Quote:

Originally Posted by tshikose

Hi,

Database replication generally involves huge data.
So, trying to start a replication when a failover occcurs is not pratical. The replication time will just take long enough, keeping the service unavailable, and then defeating the goal of HA.
You need to permanently replicate the data between your two database instances. Like that when a failover occurs, just few not yet replicated data need to be shipped across.

Googling a little for "Linux Postgres replication" gave me this.

Will this take care of my HA needs too?
I mean, with this implemented, if my main system goes down then the client will be automatically redirected to standby database or do i still need to implement HA using pacemaker and corosync sir?

tshikose · 04-21-2016, 02:14 AM

Hi,

Yes, I think that you can have a pgsql resource managed by your HA pacemaker.
And if the database information normally located in /var/lib/pgsql/data iis well synced by the Postgres replication, then the pgsql resource should take over where the previous left.

ravishchugh · 04-21-2016, 02:17 AM

Sir can u pls explain this point...its making some sense but not fully understood by me...sorry for the trouble.