LinuxQuestions.org


grabner 05-31-2004 12:55 PM

NFS hangs (server: 2.4.21, client: 2.6.6)
 
Hi!

The NFS client reproducibly hangs after transferring ~1 MB in the following configuration:

NFS server: SuSE 9.0 (with SuSE kernel 2.4.21-99), runs on a P-III 1GHz with 256MB RAM

NFS client: SuSE 9.1, with the shipped kernel (2.6.4) replaced by a standard 2.6.6 kernel built with the identical configuration (taken from /proc/config.gz). I replaced the kernel because of random hangs during boot (RAID initialization). Tested on three (very) different systems, so the problem seems to be hardware-independent.

The kernel upgrade fixed the boot problem, but introduced the NFS problem (which did not occur with the SuSE kernel 2.6.4). I can mount the NFS directories exported by the 2.4.21 server on the 2.6.6 client, list directory contents and access small files. However, when reading a large file, the transfer stops after ~1 MB, and a reboot is required to get NFS working again. The problematic processes on the client seem to be "rpciod" and "lockd", since they survive both "init 1" and "kill -9". They appear like this in "ps":

9366 ? S 0:00 [rpciod]
9367 ? S 0:00 [lockd]
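
For anyone trying to reproduce this, the point at which a read stalls could be measured with something like the following. This is only a minimal sketch and not part of the original report: the function name `read_with_progress`, the 64 KiB chunk size and the output format are all assumptions made for illustration. It reads a file unbuffered, chunk by chunk, and prints the running byte count after each chunk, so that the last line printed before a hang shows roughly how far the transfer got.

```python
import sys
import time


def _print_flushed(message):
    # Flush immediately so the last progress line is visible if the read hangs.
    print(message, flush=True)


def read_with_progress(path, chunk_size=64 * 1024, report=_print_flushed):
    """Read `path` in chunks and report the cumulative byte count.

    Returns the total number of bytes read. If the underlying read blocks
    (for example on a hung NFS mount), the last reported line indicates
    how many bytes arrived before the stall.
    """
    total = 0
    start = time.monotonic()
    # buffering=0 avoids Python-level read-ahead, so each reported count
    # corresponds to data actually returned by the operating system.
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            report(f"{total} bytes after {time.monotonic() - start:.2f}s")
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    read_with_progress(sys.argv[1])
```

Run against a large file on the NFS mount (the path is passed as the first argument); on a healthy mount it simply runs to the end of the file.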

The client log file (node name "gx5") includes the complaint by "mount" about being older than the kernel, and some messages about the server (despite the "OK" messages there was no response):

-----------------
May 31 18:36:16 gx5 kernel: nfs warning: mount version older than kernel
May 31 18:38:21 gx5 kernel: nfs: server gateway not responding, still trying
May 31 18:38:21 gx5 last message repeated 15 times
May 31 19:01:39 gx5 kernel: nfs: server gateway OK
May 31 19:01:39 gx5 last message repeated 14 times
May 31 19:03:24 gx5 kernel: nfs: server gateway not responding, still trying
May 31 19:03:24 gx5 last message repeated 10 times
-----------------
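
The "not responding" / "OK" pairs in an excerpt like the one above can be turned into outage intervals with a short script. The following is a hedged sketch, not an established tool: the regular expression is fitted only to the message format shown in this log, the helper name `nfs_outages` is made up, and the year (2004, the year of this post) has to be supplied because syslog timestamps do not carry one. Collapsed "last message repeated" lines are ignored.

```python
import re
from datetime import datetime

# Matches only the kernel NFS messages in the format shown in the excerpt.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"nfs: server (?P<server>\S+) (?P<state>not responding|OK)"
)

SAMPLE_LOG = """\
May 31 18:36:16 gx5 kernel: nfs warning: mount version older than kernel
May 31 18:38:21 gx5 kernel: nfs: server gateway not responding, still trying
May 31 18:38:21 gx5 last message repeated 15 times
May 31 19:01:39 gx5 kernel: nfs: server gateway OK
May 31 19:01:39 gx5 last message repeated 14 times
May 31 19:03:24 gx5 kernel: nfs: server gateway not responding, still trying
May 31 19:03:24 gx5 last message repeated 10 times
"""


def nfs_outages(lines, year=2004):
    """Return (server, start, end) tuples; end is None if no "OK" followed."""
    open_since = {}  # server name -> time of first unanswered "not responding"
    outages = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
        server = match["server"]
        if match["state"] == "not responding":
            # Keep the earliest timestamp of the current outage.
            open_since.setdefault(server, stamp)
        elif server in open_since:
            outages.append((server, open_since.pop(server), stamp))
    # Outages that never saw an "OK" message are reported as still open.
    outages.extend((server, start, None) for server, start in open_since.items())
    return outages


if __name__ == "__main__":
    for server, start, end in nfs_outages(SAMPLE_LOG.splitlines()):
        length = "still open" if end is None else f"{(end - start).total_seconds():.0f}s"
        print(f"{server}: {start:%H:%M:%S} -> {length}")
```

For the excerpt above this yields one closed interval from 18:38:21 to 19:01:39 and one interval starting at 19:03:24 with no matching "OK".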

The server log file (node name "gateway") shows nothing suspicious (besides the assigned port number "666", which is random in this case, I guess :)):

-----------------
May 31 18:36:19 gateway rpc.mountd: authenticated mount request from gx5.home.at:666 for /disk/gateway/audio (/disk/gateway/audio)
-----------------

OK, so I guess the cause is either a kernel bug or a severe misconfiguration. But since I have never dug that deep into the networking system, I have no idea which tools to replace (i.e., recompile against the new 2.6.6 kernel) to fix the problem. I should probably start with "mount", but before I go through this by trial and error, I would like to ask the networking gurus out there for a hint: which part of the system is so confused by a kernel upgrade from 2.6.4 to 2.6.6 that it causes the problem described above?

Thanks in advance,
Markus


All times are GMT -5.