Quote:
Originally Posted by syg00
What changed?
Nothing, as far as I can see. But let me tell you more of the story, so maybe you can tell me what could have changed.
The fileserver used to be running an ancient version of OpenSUSE (I think it was 10.3) and exported using NFSv3. I never ran an update, because I never rebooted the system. I know that no updates would run because I actually disabled the update system on that machine.
One day I lost power, and the fileserver rebooted. I took the opportunity to update the system software on the client (Arch Linux) machine, but I didn't do anything to the server.
For some reason, after the client came back up, it couldn't connect to the server at all. So I ssh'd into the server and restarted the portmapper and idmapd daemons, and voilà, it worked.
Except that now I was getting these timeouts. They only happened when copying a lot of data (usually over a GB), which I don't do very frequently. But it was annoying, and sometimes my backup cron job would die, putting my data at risk.
I figured it was due to the ancient versions of nfsd and Linux I was running on the fileserver, so I bought a new hard drive, installed Arch Linux on it, and brought the whole thing up to date. It's now on Linux 3.8.6 and has the most recent (as of about two weeks ago) versions of nfsd, etc. I still don't run updates and don't reboot, but at least now the baseline is more current.
However, even after all this the problem still persists.
I've tried replacing the switch and re-cabling the network, but that doesn't seem to have helped. Also, the problem only shows up with NFS writes (not reads) to the fileserver. I can read data all day long, I can sftp data to the fileserver with no problems, etc. The problem also shows up if I mount the volume on other computers on the network, so it's not just localized to my main desktop PC.
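In case it helps anyone reproduce this, here is a rough Python sketch of how a large write could be timed chunk by chunk to see where it stalls. The mount point /mnt/fileserver, the function name timed_write, and the 5-second stall threshold are all placeholders I made up for illustration, so adjust them to your own setup.

```python
import os
import time


def timed_write(path, total_mb=2048, chunk_mb=16, stall_seconds=5.0):
    """Write total_mb of zeros to path in chunk_mb pieces, fsync after each
    piece, and return (chunk_index, seconds) for every piece that took
    longer than stall_seconds."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    stalls = []
    with open(path, "wb") as f:
        for i in range(total_mb // chunk_mb):
            start = time.monotonic()
            f.write(chunk)
            f.flush()
            # Force the data out to the server rather than leaving it
            # in the client's page cache.
            os.fsync(f.fileno())
            elapsed = time.monotonic() - start
            if elapsed > stall_seconds:
                stalls.append((i, elapsed))
    return stalls


if __name__ == "__main__":
    # Hypothetical NFS mount point; substitute your own.
    for index, seconds in timed_write("/mnt/fileserver/nfs_write_test.bin"):
        print(f"chunk {index} took {seconds:.1f}s")
```

Running the same function against a local directory gives a baseline to compare the NFS mount against.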
So although I say "nothing changed", as you can see, after the problem started happening I changed almost everything, and the issue still persists.