Hello, I will try to keep this post short, but I will describe my whole scenario because I truly find myself quite lost here.
I have one Linux desktop running Fedora Core 15. It is connected by Ethernet and has address 'A'. This desktop is dual boot and also has Windows XP, which uses the same address.
I have 3 proxy servers, all running CentOS 5.5 with SSH enabled and running, with IP addresses proxyip1, proxyip2, proxyip3.
There are no firewalls configured on any of the servers, nor on my desktop environments.
I run the following tests on my Fedora desktop:
it shows the OpenSSH banner, I press a few keys and it disconnects as expected. Everything OK here;
it shows the OpenSSH banner, I press a few keys and it disconnects as expected. Everything OK here too;
nothing happens; it keeps trying and trying until it just gives up.
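For anyone who wants to repeat this kind of check from a script, here is a minimal sketch. It assumes Python 3 on the desktop; the function name `probe` and the 3 second timeout are my own choices, and the host names on the command line are placeholders. It only tells apart "connected", "refused" and "no answer at all", which is the distinction that matters here.

```python
import socket
import sys
import time


def probe(host, port=22, timeout=3.0):
    """Try a single TCP connect to host:port and classify the outcome."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "timeout"   # SYN was sent, nothing came back before the deadline
    except ConnectionRefusedError:
        return "refused"   # the host answered, but with a reset
    except OSError as exc:
        return "error: %s" % exc   # DNS failure, no route, etc.
    sock.close()
    return "open"


if __name__ == "__main__":
    # Hosts are taken from the command line; with no arguments nothing runs.
    for host in sys.argv[1:]:
        print(time.strftime("%H:%M:%S"), host, probe(host))
```

Run as something like `python3 probe.py proxyip1 proxyip2 proxyip3` (the file name is hypothetical). A "timeout" from one proxy while the others report "open" would match the third case above.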
The thing is, it doesn't happen all the time, just "sometimes". The main problem isn't even with SSH. We noticed that some of our Linux desktops were extremely slow accessing the Internet. All three proxies load-balance our Internet connections, and the Linux desktops were so slow that I decided to debug a bit. I noticed that during the slow periods we couldn't reach some of the proxy services, and then that it wasn't just the proxy service that was unavailable: any TCP connection to the affected proxy server was simply impossible. However, pinging that very same server works, and no firewall is configured!
The strangest thing is that this doesn't always happen with the same server; sometimes it happens with either of the other two. Even stranger: I reboot my desktop into Windows XP and it works great, it never fails. I reboot again into Linux, repeat the test, and it fails as described. I launch VirtualBox, where I have a virtual machine with Windows XP, configure a new address for it, and put its network in bridged mode on my Ethernet card, the very same one the Linux host is using. On the Fedora host it still fails; inside the VirtualBox Windows XP guest it WORKS.
At first I thought it was a problem within the network: the packets being blocked somewhere before reaching the server, or maybe the load-balancing protocol (WCCP) causing this. But no. I went to all the proxies and executed:
Code:
tcpdump not host anotheripfromwhereimconnecting and tcp port 22
so that I can see all traffic on port 22 (apart from my own session from the other IP), and, well, the packets arrive at all the servers, but on the "problematic server of the moment" nothing happens after they arrive. Nothing. No packet is sent back. If I saw a reply leaving the server, the problem could be somewhere in the network, such as a router blocking it, but an answer never leaves the server.
Output of tcpdump in a server that is working:
Code:
10:28:46.049975 IP myip.50284 > serverip.ssh: S 1027432837:1027432837(0) win 14600 <mss 1460,sackOK,timestamp 6735462 0,nop,wscale 7>
10:28:46.050017 IP serverip.ssh > myip.50284: S 2399763698:2399763698(0) ack 1027432838 win 5792 <mss 1460,sackOK,timestamp 2494826769 6735462,nop,wscale 8>
10:28:46.051098 IP myip.50284 > serverip.ssh: . ack 1 win 115 <nop,nop,timestamp 6735463 2494826769>
10:28:46.051121 IP myip.50284 > serverip.ssh: R 1:1(0) ack 1 win 115 <nop,nop,timestamp 6735463 2494826769>
Output of tcpdump in a server that is experiencing the problem:
Code:
10:28:17.521801 IP myip.50567 > serverip.ssh: S 581526549:581526549(0) win 14600 <mss 1460,sackOK,timestamp 6706958 0,nop,wscale 7>
10:28:17.621496 IP myip.50568 > serverip.ssh: S 572792424:572792424(0) win 14600 <mss 1460,sackOK,timestamp 6707058 0,nop,wscale 7>
10:28:26.527413 IP myip.50580 > serverip.ssh: S 722321451:722321451(0) win 14600 <mss 1460,sackOK,timestamp 6715964 0,nop,wscale 7>
10:28:26.627455 IP myip.50581 > serverip.ssh: S 713286386:713286386(0) win 14600 <mss 1460,sackOK,timestamp 6716064 0,nop,wscale 7>
10:28:35.538137 IP myip.50593 > serverip.ssh: S 858993821:858993821(0) win 14600 <mss 1460,sackOK,timestamp 6724974 0,nop,wscale 7>
10:28:35.637897 IP myip.50594 > serverip.ssh: S 852852340:852852340(0) win 14600 <mss 1460,sackOK,timestamp 6725074 0,nop,wscale 7>
10:28:44.556576 IP myip.50606 > serverip.ssh: S 1005758577:1005758577(0) win 14600 <mss 1460,sackOK,timestamp 6733993 0,nop,wscale 7>
10:28:44.656551 IP myip.50607 > serverip.ssh: S 995859375:995859375(0) win 14600 <mss 1460,sackOK,timestamp 6734093 0,nop,wscale 7>
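To go through longer captures without eyeballing them, here is a minimal sketch that lists the SYNs that never got a SYN-ACK. It assumes Python 3 and the older tcpdump text format shown above (flags as a bare `S`, `.` or `R` after the colon); newer tcpdump versions print `Flags [S]` instead and would need a different pattern. The function name `unanswered_syns` is made up for this example.

```python
import re

# timestamp, source, destination, flags, rest of the line
LINE_RE = re.compile(r"^(\S+) IP (\S+) > (\S+): (\S+) (.*)$")


def unanswered_syns(lines):
    """Return (timestamp, client, server) for each SYN with no SYN-ACK reply."""
    pending = {}  # (client, server) -> timestamp of the SYN
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts, src, dst, flags, rest = match.groups()
        if "S" not in flags:
            continue  # only SYN and SYN-ACK segments are of interest
        if re.search(r"\back \d+", rest):
            # SYN-ACK from the server: the matching SYN was answered
            pending.pop((dst, src), None)
        else:
            pending[(src, dst)] = ts
    return sorted((ts, client, server) for (client, server), ts in pending.items())
```

Fed the first capture above it returns an empty list; fed the second, every SYN comes back as unanswered.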
I've captured packets on intermediate routers to try to spot any difference between a packet generated by a Linux host and one generated by a Windows host. I was expecting to see some difference in TTL, MSS, MTU, etc. Nothing; everything "looks" the same.
Can anyone give me a hint about where to go from here? Can I somehow check whether the packet is being discarded by the Linux server's kernel? I'm losing my mind here... please help.
This is apparently not a new problem on this network. When I arrived here and said I was having trouble accessing the Internet, other people here already knew this could happen.
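One thing I could try on the server side, as far as I understand it, is to watch the kernel's own TCP counters in /proc/net/netstat while the problem is happening. Below is a minimal sketch, assuming Python 3 is available on the proxy. The counter names in `WATCH` (`PAWSPassive`, `ListenOverflows`, `ListenDrops`) are ones I know from the TcpExt line on Linux, but whether they are all present on the CentOS 5.5 kernel, and whether any of them is the one moving here, is an assumption. The function names are made up for this example.

```python
import sys
import time

# Counters that may relate to incoming SYNs being dropped (assumed, see above).
WATCH = ("TcpExt.PAWSPassive", "TcpExt.ListenOverflows", "TcpExt.ListenDrops")


def parse_netstat(text):
    """Parse /proc/net/netstat: alternating 'Prefix: names' / 'Prefix: values' lines."""
    counters = {}
    lines = text.splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + "." + name] = int(number)
    return counters


def changed(before, after, names=WATCH):
    """Return {counter: increase} for watched counters that moved."""
    result = {}
    for name in names:
        delta = after.get(name, 0) - before.get(name, 0)
        if delta:
            result[name] = delta
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    interval = float(sys.argv[1])  # seconds between the two snapshots
    with open("/proc/net/netstat") as handle:
        first = parse_netstat(handle.read())
    time.sleep(interval)
    with open("/proc/net/netstat") as handle:
        second = parse_netstat(handle.read())
    print(changed(first, second) or "no watched counter changed")
```

The idea would be to start it with an interval (for example `10`) on the problematic server and retry the connection from the Fedora desktop during that window. If one of the counters goes up in step with the failed attempts, that would at least show the kernel is dropping the SYNs itself.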
Any help will be greatly appreciated!
Thank you, and sorry for the long post.