LinuxQuestions.org

Kropotkin 09-02-2008 08:07 AM

dmesg: "nfs: server [...] not responding, still trying"
 
Hi all,

Several days ago I did a clean reinstall of Fedora 9. For some reason, I am now having problems with NFS that I didn't have previously.

venus is my FreeBSD v7 server. I have this line in the fstab of my F9 client:
Code:

venus:/media/disk8      /media/disk8            nfs    defaults      0  0

The problem is that after a few hours, Nautilus freezes on me. I see this line in my dmesg output:
Code:

nfs: server venus not responding, still trying

My server is definitely online; I can ping it and ssh to it. After waiting a while, my only recourse is to force-unmount the NFS share with umount -l; I can then restart Nautilus and any other programs that may have crashed trying to access the share.

I haven't had these issues previously with this server, nor have I seen the same problem with another client running Ubuntu.

Any ideas?

JonnerR 09-02-2008 03:56 PM

What about server messages?
 
What messages are there on the NFS server machine from about the same time? Also, what about /var/log/messages on the client machine?

BotKeeper 09-02-2008 04:00 PM

Quote:

Originally Posted by Kropotkin (Post 3267144)
Hi all,

Several days ago I did a clean reinstall of Fedora 9. For some reason, I am now having problems with NFS that I didn't have previously.
Any ideas?

First, check the firewall and /etc/hosts.{deny,allow}.

babel17 09-02-2008 08:09 PM

So just to clarify, you had this working fine at one point on Fedora 9, but reinstalled F9 onto the same host? Are you sure you installed the same config? Especially iptables (the Linux firewall)?

I'm not sure if Fedora defaults to NFS 2 or 3, but I would try forcing NFS 3 and TCP. I usually use "bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr" instead of "defaults".

Kropotkin 09-04-2008 03:32 PM

Quote:

Originally Posted by JonnerR (Post 3267581)
What messages are there on the NFS server machine from about the same time? Also, what about /var/log/messages on the client machine?

Very sporadically, I am seeing lines like these on the server:
Code:

Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287

However, I can't match them by time to the errors in /var/log/messages on the client:
Code:

Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
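
For anyone trying the same comparison, here is a minimal Python sketch of lining up the two logs by timestamp. It uses only the log lines quoted in this post. The function names (parse_stamp, nearest_gap) are made up for illustration, and the year is an assumption because syslog timestamps do not carry one. It also assumes the clocks on the two machines agree, which nothing in the thread confirms.

```python
from datetime import datetime, timedelta

# Log lines copied from this post. The year passed to parse_stamp is an
# assumption: syslog timestamps do not include one.
SERVER_LOG = """\
Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287
"""
CLIENT_LOG = """\
Sep  4 22:01:42 localhost kernel: nfs: server venus not responding, timed out
"""


def parse_stamp(line, year=2008):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp of a log line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def nearest_gap(client_line, server_lines):
    """Return the smallest absolute time gap between one client line and any server line."""
    t = parse_stamp(client_line)
    return min(abs(t - parse_stamp(s)) for s in server_lines)


server_lines = SERVER_LOG.splitlines()
for line in CLIENT_LOG.splitlines():
    gap = nearest_gap(line, server_lines)
    print(f"{line[:15]}  nearest server disk timeout: {gap} away")
```

A large gap does not rule the disk out as a cause, since the client may only log after its own retry timers expire; the sketch just makes the comparison explicit.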

Quote:

Originally Posted by babel17 (Post 3267783)
So just to clarify, You had this working fine at one point on Fedora 9, but reinstalled F9 onto the same host?

Yes.

Quote:

Are you sure you installed the same config? Especially iptables (Linux Firewall)?

I didn't restore all the /etc files, but I configured the Fedora client more or less the same way. I haven't changed anything on the server for some months.

It is not a firewall issue: I don't use one on the Fedora client; I use PF on the FreeBSD box, which also serves as a gateway.

The other odd thing is that I am not seeing any such errors on the Ubuntu client accessing the same NFS shares. It seems to be specific to this Fedora box.

Quote:

I'm not sure if Fedora defaults to NFS 2 or 3, but I would try forcing NFS 3 and TCP. I usually use "bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr" instead of "default"

OK, my fstab entry now looks like this:
Code:

venus:/media/disk8      /media/disk8            nfs    rw,soft,bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr  0  0
Will see if this helps...
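
To make it easier to see exactly which options changed, here is a small Python sketch that splits an fstab line into its fields and compares the option list with the one suggested above. The helper name parse_fstab_line is hypothetical, and the sketch assumes a well-formed six-field line with no embedded spaces.

```python
# The fstab entry and the suggested option string, both copied from this thread.
ENTRY = "venus:/media/disk8      /media/disk8            nfs    rw,soft,bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr  0  0"
SUGGESTED = "bg,rsize=32768,wsize=32768,nfsvers=3,tcp,intr"


def parse_options(opts):
    """Turn 'a,b=1' into {'a': True, 'b': '1'}; bare flags map to True."""
    options = {}
    for opt in opts.split(","):
        key, _, value = opt.partition("=")
        options[key] = value or True
    return options


def parse_fstab_line(line):
    """Split one six-field fstab line (hypothetical helper, no error handling)."""
    device, mountpoint, fstype, opts, dump, passno = line.split()
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": parse_options(opts),
        "dump": int(dump),
        "passno": int(passno),
    }


entry = parse_fstab_line(ENTRY)
extra = sorted(set(entry["options"]) - set(parse_options(SUGGESTED)))
print("options beyond the suggestion:", extra)
```

The difference is rw and soft. As far as I know, the nfs(5) man page cautions that soft mounts can lead to data corruption when requests time out, so that option is worth keeping in mind when reading later "timed out" messages.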

BotKeeper 09-05-2008 04:48 AM

Quote:

Originally Posted by Kropotkin (Post 3269796)
I am seeing very sporadically some lines like this:
Code:

Sep  4 20:35:32 venus kernel: ad2: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=49302399
Sep  4 21:05:21 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=190241311
Sep  4 21:35:34 venus kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=287

This looks like the disk is going bad on the server. If that is the case, the clients will time out. Use "smartctl -a /dev/sda" (or whatever the disk is called; the log above names ad2). Be sure to use the raw disk, not the RAID mount point. Look for error log entries and failure modes.

JonnerR 09-05-2008 12:05 PM

Quote:

Originally Posted by BotKeeper (Post 3270394)
This looks like the disk is going bad on the server. If this is the case, the clients will time out. Use "smartctl -a /dev/sda" (or whatever disk). Be sure to use the raw disk, not the RAID mount point. Look for error log entries and failure modes.

Yeah, I was wondering about that, but since I haven't used *BSD much, I don't know what device ad2 is. If there are intermittent disk errors on the server, that could result in inconsistent client behavior, possibly depending on load or even seeming completely random.

BotKeeper 09-06-2008 10:45 AM

Quote:

Originally Posted by JonnerR (Post 3270725)
Yeah, I was wondering about that, but since I haven't used *BSD much, I don't know what device ad2 is. If there are intermittent disk errors on the server, that could result in inconsistent client behavior, possibly depending on load or even seeming completely random.

The smartmontools package has been ported to some BSDs.

I'm not very familiar with BSD, but I believe you need to prefix the device name with "r" (for raw) and maybe add the suffix "c" (on Solaris this meant, for historical reasons, the "entire disk"). So, maybe /dev/rad2c?

Kropotkin 09-27-2008 04:46 AM

solved apparently
 
Hi again everyone,

About a week ago I upgraded the kernel on the F9 system to v2.6.26.3-29, and I haven't seen this issue (NFS timeouts on the client) since. It appears to have been related to the earlier kernel, v2.6.25.14-108.

In any case, thanks for the feedback.

