LinuxQuestions.org - dmesg: "nfs: server [...] not responding, still trying"


Hi all,

Several days ago I did a clean reinstall of Fedora 9. For some reason, I am now having problems with NFS that I didn't have previously.

venus is my FreeBSD v7 server. I have this line in the fstab of my F9 client:

Code:

venus:/media/disk8 /media/disk8 nfs defaults 0 0

Problem is that after a few hours, nautilus freezes on me. I see these lines in my dmesg output:

Code:

nfs: server venus not responding, still trying

My server is definitely online; I can ping it and ssh to it. After waiting a while, my only recourse is to force-unmount the NFS share using umount -l. I can then restart Nautilus and any other programs that crashed trying to access the share.
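A minimal sketch for spotting the hang from a terminal before Nautilus freezes, assuming GNU coreutils `timeout` is installed on the client. The function name `mount_responds` and the 5-second default are illustrative and not from the thread. The check may still succeed from cached attributes for a short while after the server stops answering.

```shell
# Return 0 if the mount point answers within the given number of seconds.
# $1 = mount point, $2 = seconds to wait (hypothetical default: 5)
mount_responds() {
    # SIGKILL is used because a process blocked on NFS I/O may ignore gentler signals.
    timeout -s KILL "${2:-5}" ls -d "$1" >/dev/null 2>&1
}
```

For example, `mount_responds /media/disk8 10 || umount -l /media/disk8` (run as root) would fall back to the lazy unmount described above.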

I haven't had these issues previously with this server, nor have I seen the same problem with another client running Ubuntu.

Any ideas?

What about server messages?

Quote:

Originally Posted by Kropotkin (Post 3267144)

Code:

venus:/media/disk8 /media/disk8 nfs defaults 0 0

Problem is that after a few hours, nautilus freezes on me. I see these lines in my dmesg output:

Code:

nfs: server venus not responding, still trying

What messages are there on the NFS server machine from about the same time? Also, what about /var/log/messages on the client machine?

Quote:

Originally Posted by Kropotkin (Post 3267144)

Hi all,

Several days ago I did a clean reinstall of Fedora 9. For some reason, I am now having problems with NFS that I didn't have previously.
Any ideas?

First, check the firewall and /etc/hosts.{deny,allow}.

So just to clarify, you had this working fine at one point on Fedora 9, but reinstalled F9 onto the same host? Are you sure you installed the same config? Especially iptables (the Linux firewall)?

I'm not sure if Fedora defaults to NFS 2 or 3, but I would try forcing NFS 3 and TCP. I usually use "bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr" instead of "defaults".
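To see which options a mount actually ended up with, rather than guessing at the defaults, something like the following might help. It is only a sketch: the function name `nfs_mount_opts` is made up for illustration, and it assumes a Linux client where /proc/mounts lists the effective options.

```shell
# Print the options recorded for one NFS mount point.
# Reads /proc/mounts-style lines (device, mount point, type, options, ...) on stdin.
nfs_mount_opts() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ { print $4 }'
}

# Typical use on the client:
#   nfs_mount_opts /media/disk8 < /proc/mounts
```

If the forced settings took effect, the output should mention the NFS version and TCP among the listed options.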

Quote:

Originally Posted by JonnerR (Post 3267581)

What messages are there on the NFS server machine from about the same time? Also, what about /var/log/messages on the client machine?

Very sporadically, I am seeing lines like this on the server:

Code:

Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287

However, I can't map them timewise to the errors in messages:

Code:

Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
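One way to line up the two logs by time is to merge them into a single chronological listing. This is a rough sketch only: `merge_syslogs` is a made-up name, and it assumes both logs use the classic syslog timestamp format shown above, cover the same year, and come from machines whose clocks are reasonably in sync.

```shell
# Merge syslog-format files into one listing ordered by month, day, and time.
# Arguments: one or more log files, for example copies of the server and client logs.
merge_syslogs() {
    # -M sorts month names, -n sorts the day numerically, -s keeps equal lines in input order.
    LC_ALL=C sort -s -b -k1,1M -k2,2n -k3,3 "$@"
}

# Hypothetical use, with the server log copied to the client first:
#   merge_syslogs /var/log/messages ./venus-messages | grep -E 'nfs:|ad2:'
```

Server disk timeouts that appear shortly before each client NFS timeout would point at the disk; no such pattern would point elsewhere.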

Quote:

Originally Posted by babel17 (Post 3267783)

So just to clarify, you had this working fine at one point on Fedora 9, but reinstalled F9 onto the same host?

Yes

Quote:

Are you sure you installed the same config? Especially iptables (Linux Firewall)?

I didn't restore all the /etc files, but I configured the Fedora client more or less the same. I haven't changed anything on the server for some months.

It is not a firewall issue: I don't run a firewall on the Fedora client. Instead, I use PF on the FreeBSD box, which also serves as a gateway.

The other odd thing is that I am not seeing any such errors on the Ubuntu client accessing the same NFS shares. It seems to be specific to this Fedora box.

Quote:

I'm not sure if Fedora defaults to NFS 2 or 3, but I would try forcing NFS 3 and TCP. I usually use "bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr" instead of "defaults".

OK, my fstab entry now looks like this:

Code:

venus:/media/disk8      /media/disk8            nfs    rw,soft,bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr  0  0

Will see if this helps...

Quote:

Originally Posted by Kropotkin (Post 3269796)

I am seeing very sporadically some lines like this:

Code:

Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287

This looks like the disk is going bad on the server. If that is the case, the clients will time out. Use "smartctl -a /dev/sda" (or whichever disk it is). Be sure to use the raw disk, not the RAID mount point. Look for error log entries and failure modes.
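A quick pass/fail check could be scripted around `smartctl -H`. The sketch below assumes the disk is an ATA drive for which smartmontools prints an "overall-health self-assessment" line; the function name `smart_passed` and the device path /dev/ad2 (guessed from the ad2 in the log) are assumptions, not something confirmed in this thread.

```shell
# Return 0 if `smartctl -H` style output on stdin reports a passing health check.
smart_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Hypothetical use on the server, as root:
#   smartctl -H /dev/ad2 | smart_passed || echo "ad2: SMART health not reported as PASSED"
```

A passing result does not rule out a failing disk, so the full "smartctl -a" output and its error log are still worth reading.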


Quote:

Originally Posted by Kropotkin (Post 3269796)

I am seeing very sporadically some lines like this:

Code:

Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287

Yeah, I was wondering about that, but since I haven't used *BSD much, I don't know what device ad2 is. If there are intermittent disk errors on the server, that could result in inconsistent client behavior, possibly depending on load or even seeming completely random.


Quote:

Originally Posted by BotKeeper (Post 3270394)

The smartmontools package has been ported to some BSDs.

I'm not very familiar with BSD, but I believe you need to prefix the device name with "r" (for raw) and maybe add the suffix "c" (on Solaris, for historical reasons, this denoted the "entire disk"). So, maybe /dev/rad2c?

solved apparently

Hi again everyone,

About a week ago I upgraded the kernel on the F9 system to v2.6.26.3-29, and I haven't seen this issue (NFS timeouts on the client) since. It appears that it was related to the earlier kernel, v2.6.25.14-108.

In any case, thanks for the feedback.