LinuxQuestions.org
Linux - Networking This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.

Old 06-21-2005, 09:12 PM   #1
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Rep: Reputation: 15
Linux real-time file replication/synchronization


I have been looking for some time for information on how to replicate data between servers. I have a hardware load balancer that will pull from two or more servers. NFS would be a single point of failure, so I do not want to use it; it would compromise redundancy, which is the main goal. I could use rsync, but I need real-time replication so that clients can see their uploads/updates instantly and not be confused. Also, I have GBs of data that rsync would need to synchronize on a regular schedule (very resource intensive). I have looked at FAM and IMON, but the only tutorial I found is very old, and the SGI::FAM Perl module is so old that it no longer installs (I really don't want to take up the torch either, as I'm no Perl developer). I understand FAM provides an API so that apps can use it to monitor file changes, but I am no C developer either! My point is that there has got to be someone out there who has done real-time file replication for Linux servers in a load-balanced environment. Any tips will be much appreciated!
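
To make the cost of scheduled scanning concrete, here is a minimal Python sketch of change detection without any kernel notification facility such as FAM/IMON. It is only an illustration, not something from this thread: the helper names `snapshot` and `diff_snapshots` are hypothetical, and comparing modification time plus size is an assumption about what counts as "changed".

```python
import os


def snapshot(root):
    """Map each file path (relative to root) to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (changed_or_new, deleted) as sorted lists of relative paths."""
    changed = sorted(p for p, sig in new.items() if old.get(p) != sig)
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted
```

Every call to `snapshot` has to stat every file in the tree, so the work grows with the total number of files rather than with the number of changes. That is the part an event-based monitor is meant to avoid.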
 
Old 06-22-2005, 02:12 PM   #2
javaroast
Member
 
Registered: Apr 2005
Posts: 131

Rep: Reputation: 19
Unison

I'm not exactly sure if this is what you are looking for, or if you want failover capabilities as well, but for simple file synchronization this will work: http://www.cis.upenn.edu/~bcpierce/unison/
 
Old 06-22-2005, 04:04 PM   #3
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
Many thanks!

I did look at Unison. I discounted it because it did not seem to be meant for real-time file replication. I will read up on it to see whether I can get it to work the way I need. In the meantime, do you know of a way to run it in a real-time (or as close as I can get) fashion?
 
Old 06-22-2005, 04:26 PM   #4
stefan_nicolau
Member
 
Registered: Jun 2005
Location: Canada
Distribution: Debian Etch/Sid, Ubuntu
Posts: 529

Rep: Reputation: 32
Have you looked at
http://www.acronis.com/enterprise/products/ATISLin/
the catch? $700

http://www.redhat.com/software/rha/gfs/

http://www.linuxfocus.org/English/Ma...ticle199.shtml
old... uses fam

http://www.alphaworks.ibm.com/tech/vitalfilebackup

If you find a solution, please post it, because I am looking for the same thing, though my real-time requirements are lower.
 
Old 06-22-2005, 11:58 PM   #5
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
Quote:
Have you looked at
http://www.acronis.com/enterprise/products/ATISLin/
the catch? $700
No, but this seems to be backup software and not mirroring. Please let me know if I'm wrong.

I have looked at GFS. It is my understanding that GFS is wonderful but expensive (extra hardware to buy). However, I am not familiar with Red Hat's flavor of GFS. Also, this may not be a high-availability solution. Again... let me know if I'm wrong.

I actually posted this link (the LinuxFocus article) in my original post. It looked like just what I was looking for. Since I have a 2.4.x kernel, IMON (the kernel feature has a different name but does the same thing, as I recall) lets file changes fire an event that FAM listens for. The problem, of course, is that the Perl module that hooks into FAM (via its API) does not install because of dependencies that can no longer be met (again, I forget the details). I suppose if I were really desperate (or ambitious), I would attempt to write my own. Another concern is that the Perl module has not been updated for so long (is this a less-than-ideal way to go about this?)

This one (the IBM alphaWorks link) is very interesting. Replication was not its end goal, but it may work. However, it's not open source, so there may be better non-free alternatives, such as:

Repliweb
or...
PeerSync

However, I just can't believe that, with load balancers out there (including LVS), there is no implementation for doing real-time file synchronization for a web server farm.
 
Old 07-01-2005, 02:48 AM   #6
RandomLinuxNewb
Member
 
Registered: Oct 2003
Distribution: Slackware
Posts: 101

Rep: Reputation: 15
I was just looking for the same thing and I found this: http://www.drbd.org/. It sounds very promising; when I start working with it, I'll post my results.
 
Old 07-05-2005, 10:46 AM   #7
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
Awesome, this looks the most promising so far. It may be tricky to implement with my existing production servers (I will need to scrounge up some boxes to test with). However, this looks like it could be THE ANSWER. I will also post my experience.

Thanks all!
 
Old 07-12-2005, 04:07 PM   #8
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
DAMN!

Got DRBD working on Debian servers. After all the toil, it dawns on me that the secondary block device cannot be mounted. Instead, this system is meant to do a switchover (via heartbeat) if the primary goes down. As for mounting the secondary read-only, this is from the DRBD website:

Quote:
Do not attempt to mount a drbd in Secondary state.

On 2.6 kernels, we don't allow it. Though (on 2.4 kernels) it is still possible to mount a Secondary device readonly, changes made to the Primary are mirrored to it underneath the filesystem and buffer-cache of the Secondary, so you won't see changes on the Secondary. And changing meta-data underneath a filesystem is a risky habit, since it may confuse your kernel to death. So don't do that.

Symptoms would be loads of Assert (mdev->state == Primary) in syslog.
So it seems this IS NOT the answer for file synchronization for one or more load-balanced servers. Unless I am mistaken, I'm back to the drawing board {SIGH}
 
Old 07-13-2005, 06:30 PM   #9
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
Okay, I found some NON-free software that works very well. The only problem is that it's fairly expensive: PeerFS. It does what I hoped DRBD would do: the files in a network block device are distributed to all nodes simultaneously, and they are all instantly available for reading. The setup was extremely easy. The only downside is the dough (WAY cheaper than hardware-based solutions, though!) Does anybody know of open-source alternatives?
 
Old 10-21-2005, 10:21 PM   #10
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
In case any thread subscribers are interested in how things turned out... well, they didn't really. PeerFS was a nightmare, and another solution that seems to work well (Constant Replicator) costs anywhere from 2-4 times as much as PeerFS.

Here's a more detailed recent post
 
Old 10-22-2005, 02:54 AM   #11
amitsharma_26
Member
 
Registered: Sep 2005
Location: New delhi
Distribution: RHEL 3.0/4.0
Posts: 777

Rep: Reputation: 31
How abt RSYNC !

http://sunsite.dk/info/guides/rsync/...mirroring.html

http://amitsharma.linuxbloggers.com/how_to_rsync.htm

Last edited by amitsharma_26; 09-05-2006 at 06:51 AM.
 
Old 10-22-2005, 01:04 PM   #12
bretticus
Member
 
Registered: Nov 2003
Distribution: Debian 3.1
Posts: 36

Original Poster
Rep: Reputation: 15
Quote:
How abt RSYNC !
I do love rsync! BUT... it's not an alternative for real-time synchronization. I currently use it for my secondary peers, which HAVE to be hot spares because I can only feasibly replicate every 2 hours or so (it takes 5-10 minutes just to count my 1.5 million files!). With websites constantly changing, my clients are gonna get pissed when they upload and can't see their changes on a secondary load-balanced peer for up to 2 hours!
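
One way to avoid the full file count, sketched here under assumptions: if something else already knows which paths changed, rsync can be handed just that list through its `--files-from` option instead of walking all 1.5 million files. The function names `build_rsync_command` and `push_changes`, and the destination `peer2:/var/www/` in the usage note, are hypothetical. Deletions are not handled.

```python
import subprocess
import tempfile


def build_rsync_command(src_root, dest, list_file):
    """Build an rsync argv that transfers only the paths named in list_file."""
    return [
        "rsync",
        "-a",                          # archive mode: keep permissions, times, links
        "--files-from=" + list_file,   # relative paths to send, one per line
        src_root.rstrip("/") + "/",    # paths in the list are relative to this
        dest,
    ]


def push_changes(src_root, dest, changed_paths, run=subprocess.run):
    """Send only changed_paths (relative to src_root) to dest."""
    if not changed_paths:
        return None  # nothing to send
    with tempfile.NamedTemporaryFile("w", suffix=".list") as listing:
        listing.write("\n".join(changed_paths) + "\n")
        listing.flush()  # make sure rsync sees the complete list
        cmd = build_rsync_command(src_root, dest, listing.name)
        run(cmd, check=True)
    return cmd
```

For example, `push_changes("/var/www", "peer2:/var/www/", ["site1/index.html"])` would run a single rsync for that one file. How the list of changed paths is produced is the hard part and is left open here.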
 
Old 10-18-2007, 02:53 AM   #13
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Rep: Reputation: 15
Quote:
Originally Posted by bretticus View Post
I have been looking for some time for information on how to replicate data between servers. I have a hardware load balancer that will pull from two or more servers. NFS would be a single point of failure, so I do not want to use it; it would compromise redundancy, which is the main goal. I could use rsync, but I need real-time replication so that clients can see their uploads/updates instantly and not be confused. Also, I have GBs of data that rsync would need to synchronize on a regular schedule (very resource intensive). I have looked at FAM and IMON, but the only tutorial I found is very old, and the SGI::FAM Perl module is so old that it no longer installs (I really don't want to take up the torch either, as I'm no Perl developer). I understand FAM provides an API so that apps can use it to monitor file changes, but I am no C developer either! My point is that there has got to be someone out there who has done real-time file replication for Linux servers in a load-balanced environment. Any tips will be much appreciated!
Linux now has a facility like FAM, and perhaps more powerful: inotify.

I recently started an open source project, "cutils", on SourceForge:
https://sourceforge.net/projects/crablfs/

The documentation can be found at:
http://crablfs.sourceforge.net/#ru_data_man

The project's mirrord/fs_mirror tools form a near-real-time file system mirroring application that works across 2 or more hosts. It mirrors many small files from one host to another as soon as possible after any modification.

mirrord/fs_mirror makes use of inotify, a facility provided by recent Linux kernels (merged into the mainline kernel in 2.6.13). I think of it as a counterpart to FAM, since FAM development on Linux stopped long ago.
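
As a rough idea of what using inotify looks like, here is a minimal Python sketch that calls the kernel interface through `ctypes`. It is not how mirrord/fs_mirror is implemented; the helper names `open_watch` and `read_events` are hypothetical, and the choice of events in `WATCH_MASK` is an assumption. It is Linux only. Note that an inotify watch covers a single directory, so a real mirror would need one watch per subdirectory.

```python
import ctypes
import os
import struct

# Event mask bits from <sys/inotify.h>.
IN_CLOSE_WRITE = 0x00000008
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
WATCH_MASK = IN_CLOSE_WRITE | IN_MOVED_FROM | IN_MOVED_TO | IN_CREATE | IN_DELETE

# struct inotify_event header: int wd; uint32 mask, cookie, len; then the name.
_EVENT_HEADER = struct.Struct("iIII")

_libc = ctypes.CDLL(None, use_errno=True)  # symbols of the running process (libc)


def open_watch(directory):
    """Return an inotify file descriptor watching one directory (not recursive)."""
    fd = _libc.inotify_init()
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    wd = _libc.inotify_add_watch(fd, os.fsencode(directory), WATCH_MASK)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, os.strerror(err))
    return fd


def read_events(fd):
    """Block until events arrive, then return them as (mask, filename) pairs."""
    data = os.read(fd, 65536)
    events = []
    offset = 0
    while offset < len(data):
        _wd, mask, _cookie, name_len = _EVENT_HEADER.unpack_from(data, offset)
        offset += _EVENT_HEADER.size
        # The name field is NUL-padded to name_len bytes.
        name = data[offset:offset + name_len].split(b"\0", 1)[0]
        offset += name_len
        events.append((mask, os.fsdecode(name)))
    return events
```

A mirroring loop would call `read_events` repeatedly and copy or delete the named files on the peers. The kernel reports only what changed, so nothing has to rescan the whole tree.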

I started these projects on LFS, but the tools can also be used on other distributions; for instance, I have also used them in an RHEL4 environment.

I hope these tools can be useful to you.

Thanks.

Last edited by Chowroc; 10-18-2007 at 02:54 AM.
 
  

