Ok. This is a tough one, and no one has responded in three hours, so I'll take a stab at it. I may be wrong on some of the technical details and would welcome comments and especially corrections. I'm working on the premise that a bad answer may be better than no answer at all.
Quote:
|
Hi, I have a socket connection where there are some problems with packet lost. The client sends a packet to the server and if the packet does not arrive the destination the client does not send a second packet as I imagine the tcp\ip should assure.
|
I have looked at hundreds (possibly thousands) of Sniffer and tcpdump traces and have never seen a situation where tcp has failed to retransmit when an ack was late.
Quote:
|
Reading the man tcp I realized that the parameter "tcp_retries2" controls the time the client takes to detect the connection lost
|
That is not entirely accurate. "tcp_retries2" controls the number of retransmissions that will take place before tcp will inform the application that an error has occurred, not the amount of time. Typically the error reported is ETIMEDOUT. And at that point the connection is not lost.
tcp/ip is often referred to as a "self tuning" protocol. What is meant by that is that it keeps track of various metrics, like packet round-trip time, receive window size, network congestion, etc.
In terms of detecting packet loss, the TCP/IP stack used in Linux continuously calculates the smoothed round-trip time (srtt) for data that it is sending on an established connection and uses this as the basis for the Retransmission Timeout (RTO). When the RTO expires without an ack arriving, tcp retransmits. An "exponential backoff" is applied to the RTO, meaning the timeout for each successive retransmission of the same data is multiplied by 1, then 2, then 4, doubling up to 64. So, on a connection with a 10ms RTO, you would see timeouts of 10ms, 20ms, 40ms, 80ms, 160ms, 320ms, etc. (The 10ms figure is only for illustration; Linux normally enforces a minimum RTO of about 200ms.) When you get to "tcp_retries2" retries, it considers the error permanent and tells the app.
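To make the backoff arithmetic concrete, here is a small sketch that reproduces the sequence described above. It is only a model of the description in this post, not of the kernel's actual timer code; the function name and the `max_multiplier` parameter are my own inventions for illustration.

```python
def backoff_schedule(initial_rto_ms, retries, max_multiplier=64):
    """Return the timeout (in ms) that precedes each retransmission.

    Models the description above: the multiplier starts at 1 and
    doubles after every retransmission until it reaches max_multiplier.
    """
    timeouts = []
    multiplier = 1
    for _ in range(retries):
        timeouts.append(initial_rto_ms * multiplier)
        multiplier = min(multiplier * 2, max_multiplier)
    return timeouts


schedule = backoff_schedule(10, 6)
print(schedule)              # [10, 20, 40, 80, 160, 320]
print(sum(schedule), "ms")   # 630 ms
```

The sum is the interesting part: it gives a rough idea of how long the application hears nothing while tcp is quietly retrying.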
Note that until you hit the maximum number of retries, your app knows nothing about the retransmissions. Also bear in mind that when the error is finally reported to the application, the connection is still established. The application can choose to close the connection or it can keep trying that I/O operation.
Quote:
|
I disconnected the ethernet cable from the connection and then I sent the packet via client, then I awaited some time, short enough for the terminal does not recognize the lost connection and long enough to the client does not be able to send the packet. So the client never sends the packet again.
|
Unless you have a really high RTO and/or you are transmitting over a verrrrry slow connection (300bps modem comes to mind), you can't unplug and plug a cable fast enough...
Have you verified this with a 'tcpdump' capture on the client end?
I suspect this is a problem with your client code failing to properly handle errors that have been reported to it. Otherwise, it's a serious error in the tcp stack and should be affecting a lot of people.
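For illustration, this is roughly the kind of error check I would expect to find in the client. It is a sketch in Python and assumes nothing about your actual code; `send_checked` and its return strings are hypothetical names of mine. The point is simply that the result of every send gets looked at, and ETIMEDOUT is told apart from other failures.

```python
import errno


def send_checked(sock, payload):
    """Send payload and report what the stack told us.

    Returns "queued" if the data was accepted by the stack (which is
    not proof of delivery), "timed-out" for ETIMEDOUT, or an
    "error: ..." string for any other socket error.
    """
    try:
        sock.sendall(payload)
    except OSError as exc:
        if exc.errno == errno.ETIMEDOUT:
            return "timed-out"
        # Fall back to the exception text when there is no errno.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    return "queued"
```

What the client does after a "timed-out" result (retry the I/O, or close and reconnect) is a separate decision; the sketch only makes sure the error is not silently dropped.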
Quote:
|
I want a configuration where the client detects the connection lost or sends a second packet if the connection is restarted fast enough.
|
Again, the connection is not lost and the tcp/ip stack should be retransmitting automatically. The connection is not lost until the client decides it is and closes the connection.
Quote:
|
Is there a way to determine the ack timeout?
|
I don't think so. But I think it is roughly twice the round-trip time reported by ping.
Quote:
|
Is the tcp_retries1 that controls the number of times a packet is sent?
|
No, 'tcp_retries2' controls that. 'tcp_retries1' controls how many retransmissions are done before it checks to see if it should retry using a different route (i.e. send the packet out a different interface or send it to a different router).
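If you want to see what both settings are on your client, they are exposed under /proc on Linux. A minimal sketch, assuming the usual /proc/sys/net/ipv4 location; the function name and the `base` parameter are mine and exist only to make the example easy to try out.

```python
from pathlib import Path


def read_tcp_retries(base="/proc/sys/net/ipv4"):
    """Return the current tcp_retries1 and tcp_retries2 values.

    A value of None means the file was missing or unreadable, e.g.
    when this is run on something other than Linux.
    """
    values = {}
    for name in ("tcp_retries1", "tcp_retries2"):
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


print(read_tcp_retries())
```

On a stock Linux system I would expect this to print 3 and 15 respectively, but check your own machine rather than taking my word for it.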
To go further with this I think a tcpdump (filtering on just your client) and a Wireshark summary print would be helpful.