LinuxQuestions.org


Woodsman 04-08-2013 06:21 PM

Frustrating rsync stoppage
 
I have some scripts to update my local Slackware repos. They are modified versions of Eric's original rsync scripts. Currently I am updating only 14.0 and Current.

The scripts run fine but about half the time the scripts stall. No error messages.

All I see is something like this:

77.30M 80% 392kB/s 0:01:53

The progress status never changes.

I thought the problem might be the upstream server. Changing the server makes no difference.

I run rsync like this:

rsync -vahP --delete

I would appreciate ideas how to debug.

Thanks. :)

jmccue 04-08-2013 06:53 PM

Hi,

Did you try -vvv... ? more v's the more 'debug' info

John

GazL 04-08-2013 07:12 PM

You might also want to consider adding a --timeout=60

Woodsman 04-08-2013 07:35 PM

I will test both options. Thanks. Of course, posting results will likely take a while since the stoppage is not 100% repeatable. Then again, seamonkey is patched almost daily (I'm exaggerating a wee bit, but not by much), so I might see results sooner.

willysr 04-09-2013 10:48 AM

you can also use --progress

Woodsman 04-09-2013 01:06 PM

Quote:

you can also use --progress
See the original post --- I'm using the P option. :)

The P option doesn't expose the stoppage anyway. As I mentioned, the progress information just stops. Nothing more.

Woodsman 05-08-2013 01:20 PM

Been a while since rsync stalled on me but today that happened. As I increased the verbosity and set a timeout, today I saw the following output:

[receiver] io timeout after 60 seconds -- exiting
rsync error: timeout in data send/receive (code 30) at io.c(140) [receiver=3.0.9]
rsync: connection unexpectedly closed (645 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]

The script stalled when updating gwenview 4.10.3 in Current. In this case I was using the oregon state server.

The confusing thing is the script works for a while without errors or warnings and then one day pukes. The rsync web site addresses some of these issues but I'm not seeing anything obvious. The web site lists the following as possible causes:

* The destination disk is full (remember that you need at least the size of the largest file that needs to be updated available in free disk space for the transfer to succeed).
* An idle connection caused a router or remote-shell server to close the connection.
* A network error caused the connection to be dropped.
* The remote rsync executable wasn't found.
* Your remote-shell setup isn't working right or isn't "clean" (i.e. it is sending spurious text to rsync).

The only two that seem to apply in my case are the second and third reasons.

I'm going to try changing --delete to --delete-during as suggested at the rsync web site and use a different server.

Otherwise, any ideas?

gezley 05-10-2013 07:21 AM

Quote:

Originally Posted by Woodsman (Post 4947334)
Otherwise, any ideas?

Buffering limits with the network adapter? Have you tried limiting the bandwidth used?

Code:

rsync -avz --bwlimit=100  # restrict to 100 KB/s (kilobytes, not kilobits)

Is it a wireless adapter? Ethernet? Realtek chip? Can you try another PCI/PCI-E/PCMCIA/USB adapter?

rouvas 05-11-2013 06:23 PM

I've seen this kind of behaviour before.
It's a network issue. Presumably your ADSL/modem/whatever you use to connect to the interwebs or your ISP resets or the connection is somehow lost and rsync just lingers there forever.

Use rsync's network timeout options (--timeout=SECONDS and --contimeout=SECONDS) and do not use --delete-before. Then when something goes wrong, rsync will eventually (after SECONDS) give up.
If all this is inside a script, check the exit status and restart the whole process if it is 30 (timeout in data send/receive) or 35 (timeout waiting for daemon connection).
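
A minimal sketch of that check-and-restart idea, assuming a POSIX shell. The function name rsync_retry and the MAX_TRIES and RETRY_DELAY variables are made up for illustration; --timeout, --contimeout and exit codes 30 and 35 come from the rsync man page. Note that --contimeout only applies when connecting to an rsync daemon, and the mirror URL and local path in the example call are placeholders.

```shell
#!/bin/sh
# Retry wrapper sketch. MAX_TRIES and RETRY_DELAY are hypothetical knobs
# of this script, not rsync options.
MAX_TRIES=${MAX_TRIES:-5}
RETRY_DELAY=${RETRY_DELAY:-30}

# Run rsync with the given arguments; retry only on timeout exit codes.
rsync_retry() {
    tries=1
    while :; do
        status=0
        rsync --timeout=60 --contimeout=60 "$@" || status=$?
        case $status in
            30|35) ;;                 # 30 = data timeout, 35 = daemon connect timeout
            *) return "$status" ;;    # success or a non-timeout error: stop here
        esac
        if [ "$tries" -ge "$MAX_TRIES" ]; then
            echo "rsync_retry: giving up after $tries tries (exit $status)" >&2
            return "$status"
        fi
        tries=$((tries + 1))
        sleep "$RETRY_DELAY"
    done
}

# Example call (placeholder source and destination):
# rsync_retry -vahP --delete rsync://mirror.example.org/slackware/ /srv/slackware/
```

Capping the number of tries keeps the loop from spinning forever if the link stays down. Because of the P option (--partial), a restarted run can pick up a partially transferred file instead of starting it over.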

Woodsman 05-11-2013 08:15 PM

Just a 1 Gbit on-board NIC and WRT54GL Linksys router.

I use rsync among all of my systems in my home LAN to keep certain files in sync. I don't recall ever seeing a hang. That includes my local router and switch. I am inclined to believe the problem is external. The ISP service is proprietary wireless (rural area) and I do have occasional problems.

I updated my script wrapper to capture and start again when the exit code is 30 or 35. I only use the script when there are changes in patches and current, so testing requires weeks of observation. We'll see how this goes. :-)

hitest 05-11-2013 08:32 PM

Woodsman,

I'll be curious to hear what you find out. I suspect it may be a network issue as I've had very good luck updating -current using the University of Oregon server.

Woodsman 05-11-2013 11:03 PM

Probably the only way I would know the cause for sure is to run wireshark concurrently with the script. I don't know exactly what to watch for. Further, as the stoppage occurs randomly and occasionally, I could wait many weeks to catch another stall.

The primary challenge is the ISP wireless system is run on the free 900 MHz band, which means a lot of things could cause interference. Anything from overranged radios to baby monitors use that band.

I do know that somebody in my general area is using such equipment because I see these random occasional stalls with just basic browsing. One moment the connection is there, then not, then is okay again. The ISP owner knows the electric utility maintenance folks use overranged radios, but they won't admit that. :-)

One problem with this issue is rsync does not seem to have any reconnect-and-retry option. rsync seems poorly designed in that respect. Thus the exit code capture and while loop seem to be the only work-around. I could revise my script to email me a message any time that happens, but I don't have any way to know the cause of the stoppage unless I do something like monitor with wireshark. The challenge with that is any time there is a large update to the Current tree, I really don't want to sit staring at my monitor for a couple of hours waiting for a possible stall.
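
Short of a packet capture, a wrapper could at least record when each stall happened and which exit code rsync returned. The following is a rough sketch under that assumption, not the actual script: the function names and the LOGFILE path are hypothetical, while the code descriptions follow the EXIT VALUES section of the rsync man page. The same line could be piped to a mail command instead, if one is set up on the machine.

```shell
#!/bin/sh
# LOGFILE is a hypothetical path; adjust to taste.
LOGFILE=${LOGFILE:-/tmp/rsync-stalls.log}

# Map an rsync exit code to its short description from rsync(1).
describe_rsync_exit() {
    case $1 in
        0)  echo "success" ;;
        10) echo "error in socket I/O" ;;
        12) echo "error in rsync protocol data stream" ;;
        30) echo "timeout in data send/receive" ;;
        35) echo "timeout waiting for daemon connection" ;;
        *)  echo "other error (see EXIT VALUES in rsync(1))" ;;
    esac
}

# Append a timestamped record of a failed run to LOGFILE.
log_rsync_failure() {
    printf '%s exit=%s (%s)\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" \
        "$(describe_rsync_exit "$1")" >> "$LOGFILE"
}
```

Calling log_rsync_failure "$?" right after a failed rsync run would build up a history of dates and times, which might be enough to line the stalls up with the ISP's outages.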

Even if I discovered the problem is caused by local radio interference, which is likely, there is nothing I can do except restart the session, which my script should now be able to do.

