Old 08-21-2014, 05:48 AM   #1
LQ Newbie
Registered: Aug 2014
Posts: 2

Rep: Reputation: Disabled
Weird retransmission problem with NFSv3 over TCP between Debian wheezy and NetApp


I am trying to understand a problem we have.
Our setup:
  • Debian wheezy VMs (on multiple ESXi 5.0 hosts) with vmxnet3 NICs, an NFS rootfs (based on LTSP) and other NFS mounts
  • NFS server: NetApp ONTAP 8.1.3P3 (the clients' rootfs and other NFS mounts all come from this one IP, so we usually have only two connections but more mounts)
  • 10GBASE network: ESXi hosts with CX4 (newer ones with Cat6), NetApp with fiber
  • all in the same network (VLAN), no router
  • MTU is 9000 on client and NetApp
  • nfs mount options:

The problem (see also the Wireshark CSV export captured on the NFS client):
  1. Two packets of the NFS connection get lost (reason still unknown): sequence numbers 4251210111 and 4251219059
  2. The NetApp sends duplicate ACKs with the sequence number of the first lost packet in the ACK field (as far as I know this should trigger a fast retransmit): seq 1681885773+ and ack 4251210111
  3. The client sends more write requests, which the NetApp ACKs with SACK (the ACK field is still the sequence number of the lost packet): seq 1681885773 onwards
  4. 12s of nothing. The client VM is frozen, since nearly all filesystem operations hang (only two connections handle all the NFS traffic, and no other filesystems are in use except tmpfs)
  5. SYN from the client (trying to reestablish the connection)
  6. Another ACK from the NetApp with ack 4251210111 (still waiting for a retransmit)
  7. RST from the client (wants to reestablish the connection)
  8. 73s of nothing (client VM still almost completely frozen)
  9. The connection is successfully reestablished
  10. Apart from timeouts that triggered in the applications, everything works again

The question(s):
Why is there no retransmission of the two lost packets?
One of two mechanisms should have been triggered: fast retransmit or the retransmission timeout (RTO). Am I missing something?
Is this a bug?
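
For reference, the fast-retransmit rule asked about here can be sketched in a few lines of Python. This is only a simplified illustration of the duplicate-ACK threshold from RFC 5681, not the Linux kernel's implementation. The names `SenderState`, `on_ack` and `first_fast_retransmit` are made up for this example, and the sketch ignores sequence-number wraparound, SACK blocks and the other conditions a real stack checks before counting an ACK as a duplicate.

```python
from dataclasses import dataclass

# RFC 5681: the third duplicate ACK in a row triggers a fast retransmit.
DUP_ACK_THRESHOLD = 3


@dataclass
class SenderState:
    """Hypothetical sender-side bookkeeping for one TCP connection."""
    last_ack: int = -1  # most recent cumulative ACK number seen
    dup_acks: int = 0   # how many times in a row last_ack was repeated


def on_ack(state: SenderState, ack: int) -> bool:
    """Process one incoming ACK number.

    Returns True exactly when the duplicate-ACK threshold is reached,
    i.e. when the segment starting at `ack` should be retransmitted.
    """
    if ack == state.last_ack:
        state.dup_acks += 1
        return state.dup_acks == DUP_ACK_THRESHOLD
    # The ACK number changed: remember it and restart the duplicate count.
    state.last_ack = ack
    state.dup_acks = 0
    return False


def first_fast_retransmit(acks):
    """Return the index of the ACK that triggers a fast retransmit, or None."""
    state = SenderState()
    for index, ack in enumerate(acks):
        if on_ack(state, ack):
            return index
    return None
```

Fed the ACK numbers from a capture, `first_fast_retransmit` shows where a sender following this rule would be expected to resend the missing segment. If the trace shows three or more duplicates and no retransmit follows, the sender is not behaving as the rule predicts.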

It happens quite randomly on multiple VMs on multiple ESXi hosts, and not on all VMs of an ESXi host. So far it has only affected wheezy; we still have some squeeze VMs that don't have the problem (yet?).
Perhaps it really is a bug, since we had been using wheezy for longer than we have had the problem (or have known about it). But I don't know how to debug it.

I hope someone can help me here. I already dug deep into TCP and lost my way a little bit.


Last edited by TheTuxKeeper; 08-22-2014 at 04:11 AM.
Old 07-21-2017, 02:23 PM   #2
LQ Newbie
Registered: Jul 2017
Posts: 1

Rep: Reputation: Disabled
Hey TuxKeeper.
I am seeing this exact behavior. Did you ever figure out what it was?
Old 07-22-2017, 09:13 AM   #3
LQ Newbie
Registered: Aug 2014
Posts: 2

Original Poster
Rep: Reputation: Disabled

Sorry, I forgot to write the solution here. But I found the ticket in our internal issue tracker!
It was a vmxnet3 driver bug (I think something to do with the offloading options).

Our solution was to switch from the VMware Tools to open-vm-tools. For Linux this is an official recommendation from VMware. They open-sourced the drivers, and they have been in the official kernel for some time now.
Uninstall the VMware Tools completely and check that all their modules are removed (they should be in /lib/modules/<kernel-version>/extra/; only the ones in /lib/modules/<kernel-version>/kernel/ should remain). Then install open-vm-tools. There should be an official package for the vmtoolsd daemon in Debian and in most other distributions (the drivers are already in the kernel and are just overridden by the VMware Tools installation).
EDIT: don't forget to rebuild the initramfs! The old driver could still be in there (update-initramfs).
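
The module check described above can be sketched in Python. This is only a rough helper, not part of any VMware or Debian tooling; the function name `leftover_extra_modules` and its `modules_root` parameter are made up for illustration. It assumes kernel modules are files whose names contain `.ko` (for example `vmxnet3.ko` or a compressed `vmxnet3.ko.xz`) and lists whatever is still present in the extra/ directory for a given kernel version.

```python
import platform
from pathlib import Path


def leftover_extra_modules(kernel_release=None, modules_root="/lib/modules"):
    """Return module files still present under <modules_root>/<release>/extra/.

    kernel_release defaults to the running kernel (same value as `uname -r`).
    An empty list means the extra/ directory is missing or holds no modules.
    """
    release = kernel_release or platform.release()
    extra_dir = Path(modules_root) / release / "extra"
    if not extra_dir.is_dir():
        return []
    # Match plain modules (*.ko) as well as compressed ones (*.ko.xz, *.ko.gz).
    return sorted(p for p in extra_dir.rglob("*.ko*") if p.is_file())


if __name__ == "__main__":
    leftovers = leftover_extra_modules()
    if leftovers:
        print("Modules still installed outside the kernel tree:")
        for module in leftovers:
            print(" ", module)
    else:
        print("No modules found in extra/ for", platform.release())
```

Anything the script prints after the uninstall is worth a closer look before rebuilding the initramfs, since other out-of-tree drivers may also live in extra/ and are not necessarily left over from the VMware Tools.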

I hope that helps you fix your issue!


Last edited by TheTuxKeeper; 07-22-2017 at 09:16 AM.

