LinuxQuestions.org
Old 11-26-2009, 07:51 PM   #1
ponga
LQ Newbie
 
Registered: Apr 2009
Posts: 17

Rep: Reputation: 0
Bizarre TCP connectivity issues from certain clients, totally mystified!


Greetings all. I've hit a wall here and simply cannot put the dots together on this problem. Any help is appreciated!

I host a couple of websites that I recently deployed. I have gotten reports, and confirmed myself, that connections time out when navigating my sites! This is only happening for a *certain subset* of users. Everyone else has no complaints. So I am trying to figure out what the common ground is in order to fix it, but I'm out of ideas.

The setup: This is kinda long, my apologies:
*Qwest DSL (2Mbps/768kbps) PPPoE using Actiontec GT701R, static IP; ISP blocks no ports and does not shape traffic (so they say).
*Debian Lenny firewall, iptables, 2.6.26 kernel, rp-pppoe driver, running NAT/MASQ.
*Web server resides on VMware Server 2.0.2, bridged network.
*Web server is Debian Lenny, Apache2 (latest .deb).

I should note that I also use this connection for my recreational Internet - it's perfectly fine. I'm getting the speed as advertised and nitro.ucsc.edu reports no errors, nor have I had any issues with connectivity whatsoever.

*The working clients: Windows IE & Firefox, some Linux and some PPC Macs (OSX) have reported no issues, and I can confirm this.

*The NOT working clients are: Some PPC Macs (OSX; Camino, Firefox and Safari fail on a couple of machines) and my mobile phone (Samsung with AT&T), which fails 100% of the time. About 50% of the time, my Linux box (latest CentOS) fails from work using either Opera or Firefox.

The funny thing is, I've tested this from a cable connection: a Windows 7 + Firefox machine works fine, while on the same connection a PPC Mac fails.

*The behavior on the failing clients is: you can load the main page, but subsequent links fail. HTTP POSTs *usually fail*... some pages work, but most don't. The result on the client side is a network timeout. On the server side (Apache logs), I never even see the request that is timing out on the client side!

So, this is a TCP issue. After looking at packet captures and seeing it for myself, it has to be a TCP issue. (But I'm not 100% sure.)

I took some packet captures.. (text below) but I can't make heads or tails of it. It APPEARS there is packet loss. Even then, I would suppose TCP could handle this. I'm not seeing any TCP resets. All I can tell is... the server and client... just get OFF the same page. I don't know how else to describe it.

*Anyway, there has to be a solution to this. But I'm just not seeing the forest because of the trees. Any help is MUCH appreciated!

I have legit pcaps and you can have the URLs if you like, just ask. And seriously, THANKS. This is driving me crazy...

* This is a capture using my Samsung mobile phone as the test client. I have the other captures for the other failing clients (Mac, etc.). They look essentially the same.

Again, TIA! --ponga

------------------------------------------------------------------------------------
Code:
No.     Time        Source                Destination           Protocol Info
      5 46.470893   client_ip_addr         server_ip_addr         TCP      24656 > http [SYN] Seq=0 Win=33580 Len=0 MSS=1460 WS=0
      6 46.471066   server_ip_addr         client_ip_addr         TCP      http > 24656 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460 WS=6
      7 46.559154   client_ip_addr         server_ip_addr         TCP      24656 > http [ACK] Seq=1 Ack=1 Win=33580 Len=0
      8 46.562949   client_ip_addr         server_ip_addr         HTTP     GET / HTTP/1.1 
      9 46.563244   server_ip_addr         client_ip_addr         TCP      http > 24656 [ACK] Seq=1 Ack=1254 Win=8384 Len=0
     10 46.564246   server_ip_addr         client_ip_addr         HTTP     HTTP/1.1 301 Moved Permanently  (text/html)
     11 46.660908   client_ip_addr         server_ip_addr         TCP      24656 > http [ACK] Seq=1254 Ack=641 Win=33580 Len=0
     12 47.748387   client_ip_addr         server_ip_addr         TCP      24656 > http [FIN, ACK] Seq=1254 Ack=641 Win=33580 Len=0
     13 47.748810   server_ip_addr         client_ip_addr         TCP      http > 24656 [FIN, ACK] Seq=641 Ack=1255 Win=8384 Len=0
     14 47.748856   client_ip_addr         server_ip_addr         TCP      26027 > http [SYN] Seq=0 Win=33580 Len=0 MSS=1460 WS=0
     15 47.749085   server_ip_addr         client_ip_addr         TCP      http > 26027 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460 WS=6
     16 47.836894   client_ip_addr         server_ip_addr         TCP      24656 > http [ACK] Seq=1255 Ack=642 Win=33580 Len=0
     17 47.838601   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=1 Ack=1 Win=33580 Len=0
     18 47.842919   client_ip_addr         server_ip_addr         HTTP     GET /tiki-mobile.php HTTP/1.1 
     19 47.843194   server_ip_addr         client_ip_addr         TCP      http > 26027 [ACK] Seq=1 Ack=1325 Win=8768 Len=0
     20 48.143480   server_ip_addr         client_ip_addr         HTTP     HTTP/1.1 200 OK  (application/vnd.wap.xhtml+xml)
     21 48.251028   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=1325 Ack=1356 Win=33580 Len=0
     22 54.504363   client_ip_addr         server_ip_addr         HTTP     GET /tiki-list_articles.php?mode=mobile HTTP/1.1 
     23 54.504552   server_ip_addr         client_ip_addr         TCP      http > 26027 [ACK] Seq=1356 Ack=2764 Win=11712 Len=0
     24 55.264068   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     25 55.372478   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0
     26 55.372692   server_ip_addr         client_ip_addr         TCP      [TCP Previous segment lost] [TCP segment of a reassembled PDU]
     27 55.372772   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     28 55.482368   client_ip_addr         server_ip_addr         TCP      [TCP Dup ACK 25#1] 26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0 SLE=2816 SRE=4268
     29 55.482592   server_ip_addr         client_ip_addr         TCP      [TCP Retransmission] [TCP segment of a reassembled PDU]
     30 55.482664   server_ip_addr         client_ip_addr         TCP      [TCP Retransmission] [TCP segment of a reassembled PDU]
     31 55.482723   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     32 55.483827   client_ip_addr         server_ip_addr         TCP      [TCP Dup ACK 25#2] 26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0 SLE=2816 SRE=4276
     33 55.483979   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     34 55.570331   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=4276 Win=32128 Len=0
     35 55.571527   client_ip_addr         server_ip_addr         TCP      [TCP Dup ACK 34#1] 26027 > http [ACK] Seq=2764 Ack=4276 Win=32128 Len=0
     36 55.571575   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     37 55.592991   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=5728 Win=30676 Len=0
     38 55.593186   server_ip_addr         client_ip_addr         TCP      [TCP segment of a reassembled PDU]
     39 55.680007   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=7188 Win=32128 Len=0
     40 55.688136   server_ip_addr         client_ip_addr         HTTP     HTTP/1.1 200 OK  (application/vnd.wap.xhtml+xml)
     41 55.772368   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=7196 Win=33580 Len=0
     42 55.876368   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=2764 Ack=7208 Win=33580 Len=0
     43 67.750772   client_ip_addr         server_ip_addr         HTTP     [TCP Previous segment lost] Continuation or non-HTTP traffic
     44 67.750963   server_ip_addr         client_ip_addr         TCP      [TCP Dup ACK 40#1] http > 26027 [ACK] Seq=7208 Ack=2764 Win=11712 Len=0 SLE=4224 SRE=4235
     45 69.891559   server_ip_addr         client_ip_addr         TCP      http > 26027 [FIN, ACK] Seq=7208 Ack=2764 Win=11712 Len=0 SLE=4224 SRE=4235
     46 69.979327   client_ip_addr         server_ip_addr         TCP      26027 > http [ACK] Seq=4235 Ack=7209 Win=33580 Len=0
     47 69.979766   client_ip_addr         server_ip_addr         TCP      26027 > http [FIN, ACK] Seq=4235 Ack=7209 Win=33580 Len=0
     48 69.979883   server_ip_addr         client_ip_addr         TCP      [TCP Dup ACK 45#1] http > 26027 [ACK] Seq=7209 Ack=2764 Win=11712 Len=0 SLE=4224 SRE=4236
     49 69.980030   client_ip_addr         server_ip_addr         TCP      47079 > http [SYN] Seq=0 Win=33580 Len=0 MSS=1460 WS=0
     50 69.980161   server_ip_addr         client_ip_addr         TCP      http > 47079 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1460 WS=6
     51 70.069512   client_ip_addr         server_ip_addr         TCP      47079 > http [ACK] Seq=1 Ack=1 Win=33580 Len=0
     52 70.070192   client_ip_addr         server_ip_addr         HTTP     [TCP Previous segment lost] Continuation or non-HTTP traffic
     53 70.070347   server_ip_addr         client_ip_addr         TCP      [TCP Window Update] http > 47079 [ACK] Seq=1 Ack=1 Win=5888 Len=0 SLE=1461 SRE=1499
     58 132.556233  client_ip_addr         server_ip_addr         TCP      47079 > http [FIN, ACK] Seq=1499 Ack=1 Win=33580 Len=0
     59 132.556518  server_ip_addr         client_ip_addr         TCP      [TCP Dup ACK 53#1] http > 47079 [ACK] Seq=1 Ack=1 Win=5888 Len=0 SLE=1461 SRE=1500
------------------------------------------------------------------------------------
end.
 
Old 11-26-2009, 08:34 PM   #2
nimnull22
Senior Member
 
Registered: Jul 2009
Distribution: OpenSuse 11.1, Fedora 14, Ubuntu 12.04/12.10, FreeBSD 9.0
Posts: 1,571

Rep: Reputation: 92
server_ip_addr client_ip_addr TCP http > 26027 [ACK] Seq=1356 Ack=2764 Win=11712 Len=0
server_ip_addr client_ip_addr TCP [TCP segment of a reassembled PDU]
client_ip_addr server_ip_addr TCP 26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0
server_ip_addr client_ip_addr TCP [TCP Previous segment lost] [TCP segment of a reassembled PDU]
server_ip_addr client_ip_addr TCP [TCP segment of a reassembled PDU]

client_ip_addr server_ip_addr TCP [TCP Dup ACK 25#1] 26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0 SLE=2816 SRE=4268
server_ip_addr client_ip_addr TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
server_ip_addr client_ip_addr TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
server_ip_addr client_ip_addr TCP [TCP segment of a reassembled PDU]
client_ip_addr server_ip_addr TCP [TCP Dup ACK 25#2] 26027 > http [ACK] Seq=2764 Ack=2808 Win=32128 Len=0 SLE=2816 SRE=4276


Looks like some switch or router can't manage fragmented packets.
I would suggest checking fragmentation.
Maybe the MTU is too large?

Also, I suggest using tcpdump for diagnostic purposes.
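
One way to test the "too large" idea against the capture in post #1 is to look at the SACK edges (SLE/SRE) in the server's ACKs: the gap between Ack and SLE is the range of client bytes that never arrived. The following is only a minimal sketch, assuming the Wireshark text-export format shown in that post; the function name `missing_range` is made up for illustration.

```python
import re

def missing_range(info):
    """Return (first_missing, first_received_after_gap, gap_size) for an ACK
    line that carries a SACK block, or None if there is no SACK block.

    `info` is the Info column of a Wireshark text export, e.g.
    "... Ack=2764 Win=11712 Len=0 SLE=4224 SRE=4235".
    """
    ack = re.search(r"\bAck=(\d+)", info)
    sle = re.search(r"\bSLE=(\d+)", info)
    if not (ack and sle):
        return None
    start, end = int(ack.group(1)), int(sle.group(1))
    return start, end, end - start

# Info columns of packets 44 and 53 from the capture in post #1.
for line in (
    "[TCP Dup ACK 40#1] http > 26027 [ACK] Seq=7208 Ack=2764 Win=11712 Len=0 SLE=4224 SRE=4235",
    "[TCP Window Update] http > 47079 [ACK] Seq=1 Ack=1 Win=5888 Len=0 SLE=1461 SRE=1499",
):
    print(missing_range(line))
# (2764, 4224, 1460)
# (1, 1461, 1460)
```

Both gaps are exactly 1460 bytes, the MSS negotiated in the handshake. In other words, the full-sized segment from the client is missing at the server while the short tail behind it gets through, which is the pattern one would expect if only maximum-size packets are being dropped somewhere on the path.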
 
Old 11-26-2009, 08:59 PM   #3
gratuitous_arp
LQ Newbie
 
Registered: Jul 2009
Posts: 28

Rep: Reputation: 17
Well, have you tried using some other application to access the server while the problem is occurring (like ping)?

Can you get back to the main page after a subsequent page fails to load?

What are the sizes of the web pages you are trying to load?

Most forward-thinking operating systems set the "Don't Fragment" bit in the IP header. Check your captures to see if the problem clients do or do not (this can be done with Wireshark or verbose mode in tcpdump).
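
For anyone who wants to check the bit by hand rather than in Wireshark, here is a rough Python sketch of what "DF set" means at the byte level. It assumes you already have the raw IPv4 header bytes of a packet; the sample header below is built by hand with documentation addresses and is purely hypothetical.

```python
import struct

DF_MASK = 0x4000  # "Don't Fragment" flag in the 16-bit flags/fragment-offset field

def dont_fragment_set(ip_header: bytes) -> bool:
    """Return True if the DF bit is set in a raw IPv4 header (at least 20 bytes)."""
    if len(ip_header) < 20 or ip_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 6-7 hold the 3 flag bits followed by the 13-bit fragment offset.
    (flags_frag,) = struct.unpack_from("!H", ip_header, 6)
    return bool(flags_frag & DF_MASK)

# Hypothetical header for illustration: version 4, IHL 5, total length 1500,
# flags/fragment field 0x4000 (DF set), TTL 64, protocol 6 (TCP), checksum 0.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 1500, 0, 0x4000, 64, 6, 0,
    bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
)
print(dont_fragment_set(sample))  # True
```

A packet with DF set that is larger than some hop's MTU cannot be fragmented in transit; it is dropped, and the sender only finds out if the resulting ICMP error reaches it.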
 
Old 11-26-2009, 09:20 PM   #4
ponga
LQ Newbie
 
Registered: Apr 2009
Posts: 17

Original Poster
Rep: Reputation: 0
Thanks guys for the quick replies!! Pulling my hair out over here.

nimnull22:
On the fragmentation side, yeah, I suppose that is possible, and it would make sense. My MTU is 1492 bytes on my DSL interface; in-house ethernet is 1500. I have iptables handling that with a "clamp-to-PMTU" line... but maybe I have that wrong and iptables is not doing it after all, or maybe 1492 is too large! I'll investigate that... And yeah, I'm taking the caps with tcpdump; the output you see was just an export from Wireshark so I could easily obfuscate IPs and post it.
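
For reference, the arithmetic behind MSS clamping can be sketched in a few lines of Python. This assumes plain IPv4 and TCP headers with no options (20 bytes each); the function name is made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER

for name, mtu in (("ethernet", 1500), ("PPPoE DSL", 1492)):
    print(name, mtu, mss_for_mtu(mtu))
# ethernet 1500 1460
# PPPoE DSL 1492 1452
```

The SYN and SYN/ACK in the capture in post #1 both advertise MSS=1460, which matches a 1500-byte MTU rather than the 1492-byte DSL link. Depending on which side of the firewall the capture was taken, that may or may not mean the clamp rule is not being applied, so it is worth checking.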

gratuitous_arp:
>Well, have you tried using some other application to access the server
>while the problem is occurring (like ping)?
Yes, ICMP is fine and what is even more weird, SSH is unaffected.

>Can you get back to the main page after a subsequent page fails to load?
Nope, not until the connection times out.

>What are the sizes of the web pages you are trying to load?
I was thinking that too, from the fragmentation perspective. The pages are all similar in size, but similar does not mean same. Although, I CAN get to a page, if I go there directly. So, it's not the size of the page that is the problem. But if I hit a page, maybe two, after that I try to get another page... I'm dead in the water.

I also tried turning off apache2 KeepAlive, as a shot in the dark... no dice.

Given that, what do you think??

THANK YOU guys for the SUPER FAST replies!!
--ponga
 
Old 11-26-2009, 10:24 PM   #5
ponga
LQ Newbie
 
Registered: Apr 2009
Posts: 17

Original Poster
Rep: Reputation: 0
Hah! Fragmentation it was!!! Crazy. I set my MTU on the DSL interface to 1400.. shazzam! Site works perfectly.

Strange though. I'm not blocking ICMP unreachable or fragmentation needed packets... but maybe the clients from those other networks were. Also, why SSH was not affected... Strange.

Anyway, it was you guys' ideas that helped me. At least now I know what the issue was and can take appropriate action to resolve it.

THANKS!!!!!
--ponga
 
Old 11-27-2009, 08:47 AM   #6
gratuitous_arp
LQ Newbie
 
Registered: Jul 2009
Posts: 28

Rep: Reputation: 17
Glad you got things worked out. It seems there was a problem with path MTU discovery, which should handle remote sites with lower MTUs automatically. If you are curious, here is a good link explaining it:

http://www.netheaven.com/pmtu.html
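
As a rough illustration of the idea, here is a toy Python model of path MTU discovery and of what happens when the ICMP "fragmentation needed" message never makes it back to the sender. It is only a sketch: the hop MTUs are invented, the function names are made up, and which device on the real path was dropping ICMP is not known from this thread.

```python
def deliver(packet_size, path_mtus, icmp_blocked):
    """Toy model of one DF-flagged packet crossing a path of hops.

    Returns ("delivered", None), ("frag_needed", mtu) when the sender is
    told about the bottleneck, or ("black_holed", None) when the packet
    is dropped and the ICMP error never reaches the sender.
    """
    for mtu in path_mtus:
        if packet_size > mtu:
            if icmp_blocked:
                return "black_holed", None
            return "frag_needed", mtu
    return "delivered", None

def send_with_pmtud(packet_size, path_mtus, icmp_blocked, max_tries=5):
    """Retry at the size reported by each "fragmentation needed" message."""
    size = packet_size
    for _ in range(max_tries):
        status, mtu = deliver(size, path_mtus, icmp_blocked)
        if status in ("delivered", "black_holed"):
            return status, size
        size = mtu  # shrink to the reported next-hop MTU and try again
    return "gave_up", size

path = [1500, 1492, 1480, 1500]  # hypothetical hop MTUs; the 1480 hop is invented
print(send_with_pmtud(1492, path, icmp_blocked=False))  # ('delivered', 1480)
print(send_with_pmtud(1492, path, icmp_blocked=True))   # ('black_holed', 1492)
print(send_with_pmtud(576, path, icmp_blocked=True))    # ('delivered', 576)
```

In the black-hole case a real TCP sender simply keeps retransmitting the same oversized segment until the connection times out, while small packets (ping, and typically interactive SSH) fit through every hop and never trigger the problem. Lowering the MTU to 1400 sidesteps discovery entirely by keeping every packet under the bottleneck.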
 
Old 11-30-2009, 11:02 PM   #7
Smartpatrol
Member
 
Registered: Sep 2009
Posts: 196

Rep: Reputation: 38
...

Last edited by Smartpatrol; 03-11-2010 at 09:51 PM.
 
  

