Best way to share files between clustered servers?
I'm going to be setting up a load-balanced cluster consisting of two to four front-end web servers, a DB server, and a few other machines. I have everything figured out except how to keep the files in sync.
My first thought was to put the web root directories on a NAS share mounted by all the servers, but I fear that won't be fast enough and won't scale. From what I understand of SAN, it won't work, because all the servers need to be able to write the same data.

I don't really want to rsync between the servers, because that creates a PITA situation: I'd have to designate one of them as the primary for all the others to rsync off of, and if that one fails, syncing stops (and rsyncing every server against every other gets really complicated fast).

The other thing I thought of is to mount a NAS share on each web server (say as /www_dist) and rsync from it to the physical server itself (say, /www on sda). That way, if I want to add a server, I don't need to touch any of the others, and there's no performance hit, since each HTTP server reads off its local SCSI drive instead of the NAS. (Oh, I would have a hot backup for the NAS server anyway.) Any input? |
I think what you want is Heartbeat (AKA Linux HA) and DRBD
|
Clustering file systems...DRBD...
I am hoping to learn real, good, straightforward, low-complexity ways to do this that are not horribly expensive! So far, my input has been:
http://en.wikipedia.org/wiki/DRBD
http://www.howtoforge.com/high_avail...drbd_heartbeat
http://sourceforge.net/projects/crablfs

In other words: real, probably very good, but not terribly straightforward and not low-complexity :-) The closest thing I have found, I think, is probably GFS, under Red Hat Enterprise 5 only. It will let you write to the same disk (iSCSI perhaps, or even plain SCSI) from two independent servers. If you look up "Clustering File Systems" on Wikipedia there is more input of this kind. J.E.B. |
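For a sense of scale, a DRBD resource definition is quite short. A minimal sketch in DRBD 8 syntax; the hostnames (nas1/nas2), IPs, and backing partition are placeholders, not anything from this thread:

```
# /etc/drbd.conf -- minimal two-node resource sketch (DRBD 8)
resource r0 {
  protocol C;                  # synchronous replication: a write completes
                               # only once it hits both nodes' disks
  on nas1 {
    device    /dev/drbd0;      # block device the filesystem sits on
    disk      /dev/sdb1;       # local backing partition
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on nas2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

Note that plain DRBD 8 is primary/secondary: only one node mounts the device at a time, which is exactly why it is usually paired with Heartbeat for failover.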
Ok wait, I think I see what you're doing here now.
What I would do (and this isn't mine, so to each their own) is set up Heartbeat and DRBD on my storage systems, much as you've described. But I would then just NFS-export the drives to the servers rather than rsync them. You already have a 1-1 redundancy on your storage, after all.

If you're dealing with database-intensive stuff, and your databases aren't prohibitively huge, you'd do better to leave the database out of that setup and just do master/master replication on all of the servers directly. The reason I say this is that it allows each server to talk to localhost directly for database queries; if you send queries over the network you'll see a huge lag in query response times. For a backup, you might do master/slave to your storage device.

Realistically speaking, if you're doing more than 1-1 redundancy (one live, one hot standby), you're not going to see any real benefit until you start implementing geographic redundancy. This is because, generally speaking, if both of your servers go down at the same time, you probably have bigger issues affecting all machines on that network, such as a power outage or a dropped internet connection (which, if you're setting things up right, would require at least two firewalls to go down too).

Since you've already got 1-1 redundancy on your storage, and you'll be setting up 1-1 redundancy on your servers, I don't see where rsync needs to come into the picture. As I said, this isn't mine, and everyone has their own way of doing things, but offhand, given the current level of information, I think this is the way I'd go with it. |
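The master/master setup suggested above is mostly a few lines of my.cnf on each box. A sketch for one of the two servers; the server IDs and auto-increment settings (which keep the two masters from generating colliding primary keys) are illustrative values:

```
# /etc/my.cnf fragment on server A
# (on server B use server-id = 2 and auto_increment_offset = 2)
[mysqld]
server-id                = 1
log-bin                  = mysql-bin   # binary log the other master replays
# interleave auto-increment values: A issues 1,3,5,... and B issues 2,4,6,...
# so simultaneous inserts on both masters never clash
auto_increment_increment = 2
auto_increment_offset    = 1
```

Each server then runs CHANGE MASTER TO pointed at the other; the application only ever talks to its local instance.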
           Internet
              |
   Load Balancer (redundant)
      |-------|-------|
    Web1    Web2    Web3
      \_______|_______/
              |
      |-------|-------|
     DB1      |      NAS1
      \_______|______/
              |
           Backup

The Backup server uses Heartbeat to monitor NAS1 and takes over the NAS functions should it go down; it also monitors DB1 and switches from slave to master should that go down. The DB, NAS and Backup servers will not be connected to the internet at all (they'll be on a private VPN). Redundancy here is an afterthought; the primary reason for this setup is to deal with high load. Traffic on the site presently peaks at 800 req/s, and with the planned additions it should generate up to 2,000 req/s. |
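The NAS1/Backup failover described above is roughly what Heartbeat's v1 haresources file expresses. A sketch only; the floating IP, DRBD resource name, mount point, and filesystem type are made-up placeholders:

```
# /etc/ha.d/haresources (identical copy on nas1 and backup)
# nas1 normally owns: a floating IP, the DRBD device, its mount, and the
# NFS service. If heartbeat stops hearing from nas1, backup takes over
# all four, in order, so clients keep using the same IP.
nas1 IPaddr::10.0.0.10/24 drbddisk::r0 Filesystem::/dev/drbd0::/export::ext3 nfs-kernel-server
```

The same file, with DB1's floating IP and a resource script that promotes the slave, would cover the DB1-to-Backup switch.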
I would definitely do as I suggested with the database servers, or you're going to create a huge bottleneck there. At the very least, set up another DB server to share the load.
Rsync will work, but you'll find DRBD faster for replication. At that level of load I wouldn't use NFS, and I'd also convert data to static pages wherever possible. All of the back-end servers should be on a "real" private network, preferably fiber or at least gigabit; otherwise you'll flood your own network with backup traffic. |
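One cheap way to get the static pages mentioned above is to pre-render the expensive dynamic pages on a schedule instead of on every hit. A crontab sketch, where the URL and paths are invented for illustration:

```
# Regenerate a heavy page once a minute; the 2,000 req/s then hit a
# static file instead of running PHP + database queries per request.
# Write to a temp file and mv it into place so clients never see a
# half-written page (mv on the same filesystem is atomic).
* * * * * wget -q -O /www/news.html.tmp http://localhost/news.php && mv /www/news.html.tmp /www/news.html
```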