ssh frozen display - maybe missing responses
I'm having a problem with connecting over ssh to a server (wrdsvr) that has me perplexed. I'm wondering if any of you have come across this before or have any ideas. I'm using putty to connect from my Windows VM desktop to a SLES 9 server.
If I connect to wrdsvr from my desktop over ssh and run certain commands with multi-line output, the display freezes after the first line. If I connect from my desktop to a different server (oksvr), and then from oksvr I connect to wrdsvr, then there is no problem. In fact, by running 'w' after connecting in that roundabout way I can see that subsequent commands I type into the frozen window still run - I just can't see anything in the window itself as the display is frozen. I have sshd logging running in debug mode on wrdsvr and there is nothing produced during this. There is also nothing in the putty event log. If I type 'exit' in the frozen window, the server sees the connection as closing normally and then gone. Usually my putty window would then close automatically, but in this frozen case it doesn't. So although it is sending characters I type in, it doesn't seem to be receiving the output in return.
commands that run successfully are:
commands that cause the display to freeze are:
top (for this one I don't even get the first line of output, it freezes immediately)
The machine I'm connecting from is a VMFusion guest running Windows XP. I get this behavior connecting using putty, but I also installed a demo version of securecrt (when this issue occurred previously) which saw the same problem, but I can't repeat it as my license expired. (Last time the issue went away while I was troubleshooting an immediate service-affecting problem on that and a number of other servers and I don't know what fixed it!) I exported the putty registry keys and the profiles for the two servers are identical. I tried loading the profile for oksvr and temporarily changing the hostname to wrdsvr, but saw the same issue. I am connecting over a Cisco VPN. My colleague is on the local network and does not see this issue when he connects to wrdsvr using putty. We are both using the same version of putty 0.60.
Here is the background on the servers. Both wrdsvr and oksvr are running SLES 9. My actions just before I noticed these issues were the following. I updated them using YOU (YaST Online Update) to the latest patch versions. Using the rpms from Novell, I installed binutils, make, gcc, and glibc-devel and finally VMware tools on both. I then rebooted. Since then I've run YOU again but that hasn't changed anything. I've compared the installed patches using diff and they are the same. Now I'm working my way through the output of rpm -qVa on each one, but nothing so far.
Thanks for your time,
I had the same symptoms with SLES9 x86_64 guests... ended up being TSO (TCP Segmentation Offload). Try disabling it using ethtool and retest. If it is the same problem, I can post a script I used to clean up the network configs.
'ethtool -k ethX' to display offload status
'ethtool -K ethX tso off' to disable TSO
That's it! Thank you so much! Sure enough:
#ethtool -k eth0
Offload parameters for eth0:
tcp segmentation offload: on
and after I turned it off the problem went away, and when I turned it back on the problem came back.
Here's the script just in case you have a few hosts to do:
Thank you for the script! I do intend to turn this off on a lot of other VM guests.
A quick update in case anyone else has a similar issue and finds this in a search. I found that tso was being turned back on after a reboot, and this was being done by VMware tools. When VMware tools tried to start tso on our other similar servers it didn't start because it was not supported. Only on this one server was tso actually starting.
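Since TSO comes back on after a reboot, the 'ethtool -K' step has to be repeated once VMware tools has started. The following is only a rough sketch of that idea and is not the script posted earlier in this thread. The function name disable_tso and the ETHTOOL variable are made up for illustration, and where you call it from (some late boot script that runs after VMware tools) is an assumption that depends on your setup.

```shell
# ETHTOOL can be overridden to point at a specific binary (hypothetical knob);
# the default assumes ethtool is on PATH.
ETHTOOL=${ETHTOOL:-ethtool}

# Turn TSO off on each interface named as an argument, but only where
# `ethtool -k` currently reports it as on.
disable_tso() {
    for iface in "$@"; do
        if "$ETHTOOL" -k "$iface" 2>/dev/null |
            grep -Eq '^[[:space:]]*tcp[ -]segmentation[ -]offload: on'; then
            if "$ETHTOOL" -K "$iface" tso off; then
                echo "$iface: TSO disabled"
            else
                echo "$iface: failed to disable TSO" >&2
            fi
        else
            echo "$iface: TSO not reported as on, left unchanged"
        fi
    done
}

# Example (as root; interface names are placeholders):
#   disable_tso eth0 eth1
```

Checking the state first keeps the output quiet on guests where TSO is not supported and so never turns on.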
I finally tracked this to the source - the server had been configured as SLES 64-bit on the VMware host even though it was actually running SLES9 32-bit. As a result the VMware host was giving it the wrong type of virtual network card (e1000 instead of flexible - see http://kb.vmware.com/selfservice/mic...rnalId=1001805). I corrected the configuration and deleted the old NIC and created a new one of the correct type. Now that it has the right network card it doesn't matter that VMware tools tries to start tso because it is not supported.
I also saw errors in VMware tools. We had been using the tarball provided by the host to install the tools. I replaced them with the VMware Tools Operating System Specific Packages (OSPs) - see http://www.vmware.com/download/packages.html. Subsequently the vmxnet NIC was able to run correctly.
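For anyone wanting to check which virtual NIC type a guest actually ended up with, 'ethtool -i' reports the driver bound to the interface. Here is a small sketch that pulls out just the driver name. The function name nic_driver is made up for illustration. My understanding is that an e1000 virtual NIC shows up with the e1000 driver and a flexible one with pcnet32 or vmxnet, but treat that mapping as an assumption and check it against the VMware documentation.

```shell
# Print the kernel driver name found in `ethtool -i` output read from stdin.
# nic_driver is a hypothetical helper name, not part of ethtool.
nic_driver() {
    awk -F': *' '$1 == "driver" { print $2; exit }'
}

# Example (eth0 is only a placeholder):
#   ethtool -i eth0 | nic_driver
```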