Clustering two servers

homa2001 · 02-07-2006, 04:06 PM

I hate to do this, but I could not find a definite answer to my question anywhere so far.
I have done quite a bit of research on clustering, but from what I have read, I understand that there should be a "main" node that manages the load.

I guess it will be clearer if I explain my goal.

I have two identical, very powerful servers (just bought from Dell). I need to cluster them together so they look like one server to the user. I am planning to run web hosting off them (and there will be another pair for mail hosting). But I feel kind of stupid devoting one of the two to management.

Please advise which clustering solution does not require a central node to manage the load. I do not care how the clustering is done, whether it's DNS round robin, process migration, etc.

Ideally I want to distribute the load between two servers that run Apache and sync with each other all the time: hosting directories, MySQL DB, etc. (same for the two qmail servers).

And whatever solution you recommend, please say whether it supports adding a new node to the cluster later on (let's say in half a year).

Thank you so much for your help in advance.

stress_junkie · 02-07-2006, 06:20 PM

First I want to say that you may not need a cluster. You can have two web servers with the same name but different IP addresses. The DNS server has both IP addresses in its database with the same name attached to each. The DNS server will then alternate between the two IP addresses when it is queried about the common name. That is a kind of load balancing called round robin. It's simple but it does not take the load on each node into account.
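To make the "does not take the load into account" point concrete, here is a minimal Python sketch of the rotation. It is only a simulation, not how any particular DNS server is implemented: the two addresses are made-up documentation-range IPs, and `assign` is a hypothetical helper that pretends every lookup goes to whichever address comes next in the rotation.

```python
from itertools import cycle

# Hypothetical addresses (documentation range), one per web server.
SERVERS = ["192.0.2.10", "192.0.2.11"]


def assign(n_requests, addresses):
    """Hand out addresses in strict rotation and count how many
    requests each one receives. Server load is never consulted."""
    rotation = cycle(addresses)
    counts = {address: 0 for address in addresses}
    for _ in range(n_requests):
        counts[next(rotation)] += 1
    return counts


if __name__ == "__main__":
    # Each server gets the same number of lookups, busy or not.
    print(assign(10, SERVERS))
```

Adding a third address to the list spreads lookups three ways in this model, which is roughly why adding a node later is easy with round robin. What the rotation cannot do is notice that one server is down or overloaded.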

There are several different kinds of clusters. The kind of cluster that has one controlling computer and the rest are slaves is called a Beowulf cluster. Another kind of cluster is a grid of computers that can send jobs to each other. openMosix is a good implementation of this kind of cluster. It has excellent tools to monitor all of the nodes in the cluster and all of the jobs running and queued in the cluster. Another kind of cluster does load balancing and provides continuous service if one node stops working. This is a high availability cluster.

Here are some web pages that should be interesting to you.

http://www.linux-ha.org/HomePage

http://en.wikipedia.org/wiki/Computer_cluster

http://sources.redhat.com/cluster/

homa2001 · 02-08-2006, 01:43 AM

I would prefer the DNS round-robin way, but how do the servers synchronize? I want those servers to also be transparent to the users who upload files, so if they upload a file to one, the changes should immediately be synced to the other one (and later, if I add more nodes, to all of them).

Thanks,
Andrey

stress_junkie · 02-08-2006, 05:09 AM

Having two web servers that were not clustered would only work if the web site was read only. If they have to share files that are updated by the users then you would have to look at a shared file system. NFS or SMB access to files on a file server would be the easy to implement solution there.

homa2001 · 02-08-2006, 09:51 AM

Right, so with that being said, that means both of them should be up all the time, if I understand right what you are trying to say about NFS or SMB (or at least the NFS/SMB server should be). Which defeats the purpose of having mirrored servers. I could probably write a script that runs every minute and syncs them, but that is ugly. I would still go with clustering, but I do not know what would suit my needs best.

stress_junkie · 02-08-2006, 01:40 PM

Coincidentally I suggested exactly what you are looking for to someone else recently. You probably would like to have a look at EVMS in a high availability cluster environment. Here is some information.

http://evms.sourceforge.net/install/cluster.html

homa2001 · 02-08-2006, 02:57 PM

Sorry for being annoying, but I was kind of picturing a system consisting of two servers that would sync everything (if that is possible, of course), so it would be transparent even for me. When I create a new hosting user or a new email account, or ideally even when I install or recompile new software, I want to do it only once, not twice or three times (if I have 3 servers in the cluster).

I read about EVMS a little bit, and it looks like there has to be a separate server (in my eyes a single point of failure) where user data must be stored.

So my view of the desired system is some kind of engine (or kernel module) that runs on both servers and lets them sync changes to each other like domain controllers do in M$, while a DNS server with a round-robin algorithm does the load distribution.

Hope that makes it clear.

PS I am now starting to think that I should bring up some sort of domain controller, so I would create a user in the directory service and the other Linux servers joined to the 'domain' would get the user changes, but I would still need to sync the user space.

Thanks,
Andrey

PPS Looks like Linux-HA comes very close to what I need, I will continue to read and search...

fouldsy · 02-08-2006, 03:21 PM

OpenLDAP could be used as your directory services system, which allows you to provide an LDAP master and then replicate the structure to LDAP slaves at regular intervals, in the same manner MS DCs sync AD structures between each other. This would replicate new user accounts or account detail changes across your servers, and across any new servers you add once you configure them as LDAP slaves.

As for sync'ing the actual data across servers, your idea of a simple rsync script or similar is probably the way to go, and not really a dirty solution at all.
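
For what it's worth, a rough sketch of such a script in Python might look like the following. The host name `web2.example.com` and the `/var/www/` path are placeholders, not anything from your setup, and `build_rsync_command` / `push` are just names I made up for illustration. It assumes rsync is installed on both machines and that the servers can reach each other over SSH without a password prompt.

```python
import subprocess


def build_rsync_command(src, host, dest, delete=False):
    """Build an rsync command that pushes src to host:dest.

    -a preserves permissions, ownership and timestamps; -z compresses
    data in transit. --delete is opt-in because it removes files on the
    destination that are missing from the source.
    """
    cmd = ["rsync", "-az"]
    if delete:
        cmd.append("--delete")
    cmd += [src, f"{host}:{dest}"]
    return cmd


def push(src, host, dest, delete=False):
    """Run the push and raise CalledProcessError if rsync fails."""
    return subprocess.run(build_rsync_command(src, host, dest, delete), check=True)


if __name__ == "__main__":
    # Placeholder host and path; this only prints the command.
    print(" ".join(build_rsync_command("/var/www/", "web2.example.com", "/var/www/")))
```

Keep in mind this is a one-way push. If users can upload to either server, each one would need to push to the other, and I would leave `delete` off in that case, since a file uploaded to one side could be removed by a push from the other before it has been copied across. It also does nothing to resolve the same file being edited on both servers between runs.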