Weird retransmission problem with NFSv3 over TCP between Debian wheezy and NetApp
Two packets of the NFS connection get lost (reason not yet found): 4251210111 and 4251219059
NetApp sends duplicate ACKs with the sequence number of the first lost packet in the ack field (as far as I know, this should trigger a fast retransmit): seq 1681885773+ and ack 4251210111
More write requests from the client, ACKed with SACK by the NetApp (the ack field still holds the sequence number of the lost packet): seq 1681885773 onwards
12 s of nothing: the client VM is frozen, since nearly all filesystem operations hang (only two connections handle all the NFS traffic, and no other filesystems are in use except tmpfs)
SYN from the client (trying to re-establish the connection)
Another ACK from the NetApp with ack 4251210111 (still waiting for a retransmit)
RST from the client (wants to re-establish the connection)
73 s of nothing (client VM still almost completely frozen)
Connection successfully re-established
Apart from timeouts triggered in the applications, everything works again
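For reference, the fast-retransmit rule the trace above relies on can be sketched roughly like this. It is only an illustration, assuming the standard threshold of three duplicate ACKs; the function name and the simplified notion of "duplicate" (same ack number as the previous ACK, ignoring window size and payload) are assumptions made for this sketch and not how the kernel's TCP stack is written.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ack numbers at which a sender following the usual
    rule would fast-retransmit.

    `acks` is the sequence of ack numbers received by the sender.
    A duplicate is (simplified) an ACK repeating the previous ack number.
    """
    triggered = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the threshold is reached.
            if dup_count == threshold:
                triggered.append(ack)
        else:
            # New ack number: start counting duplicates afresh.
            last_ack = ack
            dup_count = 0
    return triggered
```

Fed with the ack numbers seen by the client (for example, exported from the capture), a non-empty result without a matching retransmission in the trace would support the suspicion that the sender is not reacting to the duplicate ACKs.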
The question(s):
Why is there no retransmission of the two lost packets?
One of two mechanisms should have been triggered: fast retransmit or the retransmission timeout (RTO). Am I missing something?
Is this a bug?
It happens quite randomly on multiple VMs on multiple ESXi hosts, and not on all VMs of an ESXi host. But so far only on wheezy. We still have some squeeze VMs that don't have the problem (yet?).
Perhaps it really is a bug, since we have been using wheezy for longer than we have had the problem (or have noticed it). But I don't know how to debug it.
I hope someone can help me here. I have already dug deep into TCP and got a little lost.
Regards
Daniel
Last edited by TheTuxKeeper; 08-22-2014 at 04:11 AM.
Sorry, I forgot to write the solution here. But I found the ticket in our internal issue tracker!
It was a vmxnet3 driver bug (I think it had something to do with the offloading options).
Our solution was to switch from the VMware Tools to open-vm-tools. For Linux this is an official recommendation from VMware. They open-sourced the drivers, and they have been in the official kernel for some time now.
Uninstall the VMware Tools completely and check that all of their modules are removed. They should be in /lib/modules/<kernel-version>/extra/; only the modules in /lib/modules/<kernel-version>/kernel/ should remain. Then install open-vm-tools. There should be an official package for the vmtoolsd daemon in Debian and in most other distributions (the drivers are already in the kernel and are just overridden by the VMware Tools installation).
EDIT: Don't forget to rebuild the initramfs (update-initramfs)! The old driver could still be in there.
I hope that helps you fix your issue!
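The check for leftover modules described above could be scripted along these lines. This is a minimal sketch: the function name `leftover_modules` and the `*.ko*` file pattern are assumptions for illustration, and it only looks at the extra/ directory mentioned above, not at the contents of the initramfs.

```python
import os
from pathlib import Path


def leftover_modules(kernel_version=None, modules_root="/lib/modules"):
    """List module files still present under <modules_root>/<kernel>/extra/.

    Defaults to the running kernel. Returns paths relative to extra/,
    or an empty list if the directory does not exist.
    """
    if kernel_version is None:
        kernel_version = os.uname().release
    extra_dir = Path(modules_root) / kernel_version / "extra"
    if not extra_dir.is_dir():
        return []
    # Match plain and compressed module files (e.g. .ko, .ko.xz).
    return sorted(
        str(path.relative_to(extra_dir))
        for path in extra_dir.rglob("*.ko*")
        if path.is_file()
    )
```

Calling `leftover_modules()` after the uninstall should ideally return an empty list; anything it does list is worth checking before rebuilding the initramfs.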
Regards
Daniel
Last edited by TheTuxKeeper; 07-22-2017 at 09:16 AM.