Linux - Software
This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
I am looking for a solution to do a one-way sync from a primary server to a secondary server more or less in real time. It can be limited to specific directories, since it doesn't need to sync EVERYTHING on the server. We've been using Unison on a 5-minute cron. The problem is that our file base has now grown to the point where the Unison process is chewing up CPU to unacceptable levels. We've explored SAN solutions, but the cost is prohibitive. Does anyone have any ideas for how to accomplish this? Thanks!
Are you sure it's a single cron kickoff chewing up your CPU?
That is to say, have you verified that the sync which kicked off 5 minutes earlier has completed BEFORE the next one starts? If not, your issue may simply be contention for the same resources. I've often seen users with multiple copies of the same simple script running who can't figure out from top what is eating the CPU: it's the contention from multiple runs of the script all fighting for exactly the same resources.
If this is the case, simply putting a check in the script to verify that another copy isn't already running might help.
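One common way to implement that check is with flock(1) from util-linux. This is just a sketch; the lock file path and the echoed placeholder stand in for the real sync command:

```shell
#!/bin/sh
# Overlap guard using flock(1), assuming util-linux is installed.
# The lock file path and the sync command are placeholders.
LOCKFILE=/tmp/syncjob.lock

(
    # Take an exclusive lock on fd 9; -n means fail immediately
    # instead of waiting if another instance already holds it.
    flock -n 9 || { echo "previous sync still running, skipping"; exit 1; }

    # ... the real unison/rsync command would go here ...
    echo "sync would run now"
) 9>"$LOCKFILE"
```

Because the lock is tied to an open file descriptor, it is released automatically when the script exits, even on a crash, so there is no stale-lockfile cleanup to worry about.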
Failing that, my suggestion would be NFS, as the previous poster indicated. The downside is that the file then only exists in one place in reality - if that place goes away (from a disk failure, for example), both sides lose it. With rsync you have both the original and the copy.
Thanks. We need it for high availability more than anything else. You are correct in that it IS running multiple instances of itself and killing the CPU, but we don't want to run it any less often for data protection reasons. After doing some research we're leaning towards DRBD. Do you guys have any experience with this tool? Anything we should watch out for? Thanks!
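Since DRBD came up: a minimal resource definition (DRBD 8.x style) looks roughly like the following. The hostnames, disks, and addresses are placeholders, not anything from this thread, so treat it as a sketch rather than a working configuration:

```
resource r0 {
    protocol C;                 # synchronous replication
    on primary-host {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.1.10:7788;
        meta-disk internal;
    }
    on secondary-host {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.1.11:7788;
        meta-disk internal;
    }
}
```

The main thing to watch out for is split-brain: if the two nodes lose contact and both go primary, you'll have to pick a winner and discard the other side's changes, so read up on DRBD's split-brain recovery policies before going to production.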
What operating systems are the two servers running, and how are they connected? I found that NFS file transfers are about twice as fast as Samba file transfers.
Have you tried using rsync? I use it via NFS to keep backups of the data on my server (this means using rsync as a glorified, efficient version of "cp", rather than messing with the rsync daemon and such). I don't know if its performance is better than Unison's, but I'm rather impressed by its efficiency on my home systems.
I think you missed the point in my post. If a second instance kicks off before the first one completes you're not really getting the file(s) as often as you think. The first one has to complete to free up the resources for the second one to run. Running it so often buys you nothing and as you've seen in terms of performance costs you plenty.
Instead of cron to do the rsync I'd suggest:
1) Create a script that does an infinite while loop:
while true; do
    rsync ...   # whatever your rsync command line is
    sleep 300
done
2) Create another script that verifies the above is running and put this other script in cron (every 15 minutes would do I should think).
The first script does the rsync until the end of time because "true" is always true. The cron job just makes sure the first script restarts if it should fail for any reason (like after a system reboot).
The "sleep 300" says to wait 5 minutes after one rsync completes before the next one starts. You don't even have to wait that long: "sleep 60" would start the next run a minute later, and even "sleep 1" would be better than nothing.
Also do NOT put an ampersand (&) at the end of the rsync line. The & tells it to run asynchronously. You WANT it to be synchronous so that the next run follows the completion AND the sleep.
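Step 2 above (the cron-driven restart check) can be sketched like this. The script name and path are hypothetical placeholders for wherever you keep the loop script:

```shell
#!/bin/sh
# Cron-driven watchdog: restart the sync loop if it has died.
# /usr/local/bin/sync-loop.sh is a hypothetical path for the
# script containing the "while true" loop above.
if ! pgrep -f sync-loop.sh >/dev/null 2>&1; then
    /usr/local/bin/sync-loop.sh >/dev/null 2>&1 &
fi
```

A crontab entry like "*/15 * * * * /usr/local/bin/sync-watchdog.sh" would then run the check every 15 minutes. Name the watchdog differently from the loop script so pgrep -f doesn't match the watchdog itself.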