Debian Squeeze: TCP stops working, UDP doesn't, with "unexpectedly shrunk window"
Hi,
For some time I have had an issue that finally made me angry enough to deal with it. Suddenly, web browsers, curl and wget stop working (I mean they refuse to load web pages), but BitTorrent and ping continue to work. That's why I conclude that TCP doesn't work but UDP does; please tell me if I'm wrong. In most but not all cases this is accompanied by messages in dmesg: Code:
[114481.188607] TCP: Peer 192.162.164.1:33760/60908 unexpectedly shrunk window 3159965547:3159965552 (repaired)
"/etc/init.d/networking restart" doesn't help, and neither does reconnecting with NetworkManager. The only thing I have found to help is rebooting my laptop. I connect to the Internet through a Wi-Fi router. Other machines connected to my router do not suffer from this issue. If you need any other info to analyse my problem, tell me and I will write it down as soon as this happens again. My questions are: what is this, how do I protect myself from it, and how do I regain the Internet connection without rebooting the whole system? |
The kernel encountered a problem because the remote site changed its advertised window size without any reason. The kernel fixed this all by itself. It's a message of the informational level, not a warning. Watching wireless-tools output and saving packet captures (something like 'tcpdump -i [DEVICE] -s 0 -n -nn -N -w /path/to/file') may (or may not) show clues. To me Wifi always came across as rather fragile.
|
You have simply run into TCP memory pool issues. How much memory do you have? 64 MiB? You should never run into such a problem; it indicates that the memory pool for TCP is exhausted and the kernel has started to aggressively cut TCP connections. You could raise the limits, but be very careful: do not add more than 25% to those limits, and remember that they are stated in pages, not bytes!
You can try to change congestion control to a less aggressive one, like westwood or illinois. Code:
modprobe tcp-westwood
Westwood is actually the best congestion control for wireless, but it cuts the TCP window more aggressively than illinois, which is made for wireless as well (but not as its main target). P.S. I have never run into such a problem. My 2 servers, desktop and laptop are constantly on; there was a time when KTorrent was running on the desktop 24/7/365 and I never had even the slightest problem. |
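Loading the module alone does not switch the algorithm in use. As a rough sketch of the remaining steps (the helper name cc_is_available is made up for illustration; the two sysctl keys are the standard kernel ones), something like this could check what is available before switching:

```sh
#!/bin/sh
# Sketch only: cc_is_available is a hypothetical helper, not something from this thread.

# Succeeds if algorithm $1 appears in the space-separated list $2.
cc_is_available() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on the affected machine (the last two steps need root):
#   avail=$(sysctl -n net.ipv4.tcp_available_congestion_control)
#   cc_is_available westwood "$avail" || modprobe tcp_westwood
#   sysctl -w net.ipv4.tcp_congestion_control=westwood
```

The change made with sysctl -w lasts only until the next reboot, which makes it easy to try and easy to undo.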
unSpawn, thanks! It was my blind guess that those messages point directly to the problem. I'm quite a newbie and all I knew was "dmesg | tail".
WizadNoNext, thank you, too! I will read up on congestion control and the TCP memory pool. That's the kind of answer I wanted to hear: something to start with, because I didn't know where to dig. I have a relatively modern laptop with 3 GB of RAM, so no lack of memory here. Could it be caused by a buggy BitTorrent client? I use qBittorrent. Many thanks, again. |
I won't mark this thread as SOLVED until I know for certain that the problem is solved, and I will surely post when that happens. Meanwhile, any guesses are still welcome.
|
Then with that much memory, you should get a fair amount of memory for TCP.
For me it is (tcp_mem): Code:
48276 64370 96552
Description: Quote:
You could have it bigger. Another set of parameters is tcp_wmem (but it shouldn't be a problem in your case). My (automatic) settings are (these ARE in bytes!): Code:
4096 16384 2059840
Settings are unchanged (set by the kernel) on a home server with 2 GiB of RAM. unSpawn: it is just a guess, but look closely. TCP suddenly dies and won't work any more. If you know any other explanation... |
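Since tcp_mem is counted in pages while tcp_wmem is in bytes, it may help to convert the page counts into something readable. A small sketch, assuming the usual 4096-byte page (pages_to_mib is a made-up helper name; check the real page size with "getconf PAGESIZE"):

```sh
#!/bin/sh
# Sketch only: pages_to_mib is a hypothetical helper; a 4096-byte page is assumed by default.

# Convert a page count ($1) to whole MiB, given the page size in bytes ($2).
pages_to_mib() {
    pages=$1
    page_size=${2:-4096}
    echo $(( pages * page_size / 1048576 ))
}

# The three tcp_mem values quoted above (low, pressure, high), in pages:
for p in 48276 64370 96552; do
    echo "$p pages = $(pages_to_mib "$p") MiB"
done
```

With 4096-byte pages the three values come out at roughly 188, 251 and 377 MiB, which shows how far off the numbers would be if they were read as bytes.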
unSpawn,
What could it be then? I am actually quite curious about this problem. My guess does not explain the unexpected window shrinks, but those could be down to the other side of the connection, due to lost packets. |
First of all you should establish a baseline, meaning the OP should provide details about the distribution (kernel), network stack information ('sysctl net.ipv4'), network device configuration (wherever that resides) and an indication if any sysctls were tweaked. Second you observe the OP trying to load web pages and failing so when the situation arises he could first run 'dmesg' to list messages, run 'iwconfig' (or whatever tool in the wireless-tools package exposes the most information) in a loop to list changing network details and start 'tcpdump' to save traffic. With that in place he should then run network diagnostics and since, as he said, 2 out of 3 IP suite protocols seem to work, running 'tcptraceroute' (and not plain traceroute) and retrieving a page with 'curl' could help gather enough information for you to run the packet capture he might share through Wireshark. |
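The collection steps described above could be wrapped in one function so nothing is forgotten when the problem shows up. This is only a sketch: collect_diag and the output file names are invented for illustration, and it assumes dmesg, iwconfig, tcpdump and coreutils' timeout are installed.

```sh
#!/bin/sh
# Sketch only: collect_diag and the file names are hypothetical, not from the thread.
# Run as root while the problem is occurring, e.g.: collect_diag wlan0 /root/tcp-diag 60

collect_diag() {
    iface=$1
    outdir=$2
    secs=${3:-60}
    mkdir -p "$outdir" || return 1

    # Baseline: kernel version and current IPv4 sysctl settings.
    uname -r > "$outdir/uname.txt"
    sysctl -a 2>/dev/null | grep '^net\.ipv4' > "$outdir/sysctl-ipv4.txt" || true

    # Recent kernel messages and the wireless link state.
    dmesg 2>&1 | tail -n 50 > "$outdir/dmesg.txt" || true
    iwconfig "$iface" > "$outdir/iwconfig.txt" 2>&1 || true

    # Capture full packets without name resolution for $secs seconds.
    timeout "$secs" tcpdump -i "$iface" -s 0 -n -w "$outdir/capture.pcap" \
        > "$outdir/tcpdump.log" 2>&1 || true
    return 0
}
```

While the capture is running, tcptraceroute and curl can be started from a second terminal so their traffic ends up in capture.pcap.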
Code:
# uname -r Code:
# sysctl net.ipv4 Code:
# cat /proc/net/sockstat Code:
( grep sharedavail /proc/slabinfo|tr -d '#'; grep -i tcp /proc/slabinfo; grep -i udp /proc/slabinfo ) | column -t; |
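To relate the sockstat output to the tcp_mem limits discussed earlier, the "mem" field on the TCP line (counted in pages) can be pulled out and compared with the high limit. A minimal sketch, with tcp_mem_pages_in_use as a made-up helper name:

```sh
#!/bin/sh
# Sketch only: tcp_mem_pages_in_use is a hypothetical helper, not from the thread.

# Print the "mem" value (in pages) from the TCP line of sockstat-format text on stdin.
tcp_mem_pages_in_use() {
    awk '$1 == "TCP:" { for (i = 2; i < NF; i += 2) if ($i == "mem") print $(i + 1) }'
}

# Usage on the affected machine:
#   used=$(tcp_mem_pages_in_use < /proc/net/sockstat)
#   read low pressure high < /proc/sys/net/ipv4/tcp_mem
#   echo "TCP is using $used of $high pages"
```

If the used figure is nowhere near the high limit when TCP stops working, memory pool exhaustion becomes a less likely explanation.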
Today I ran into this problem again. Here is the info I managed to gather: Code:
# ifconfig wlan0
But there is one more detail now. When I plugged in an Ethernet cable and turned off the Wi-Fi card, the connection was not regained. The symptoms remained the same, as far as I could tell: curl, wget and browsers do not work, while ping, BitTorrent and the ICQ client do work. So I wonder whether this problem really has anything to do with the wireless connection. Well, that is again just a blind guess, because nothing rules out Wi-Fi causing a problem that cannot be solved by simply turning off the Wi-Fi or switching to Ethernet. |
The answer is simple: /proc/sys/net/ipv4 is a directory, not a file! You cannot set a directory or get its value.
For instance Code:
sysctl net.ipv4.tcp_rmem Code:
sysctl -a | grep ipv4 Code:
sysctl -a | grep ipv4 | less Code:
sysctl -a | grep net | less
Actually it seems that TCP is getting overloaded and either it drops everything or it simply stops working. I was trying to work out which module is responsible for TCP, but either I was too lazy or it is compiled into the kernel. If it is compiled into the kernel and crashes, then you have no other choice than to reboot, as there would be no fix. I just checked the Makefile and it is built in, without an option to build it as a module. So somehow your TCP stack dies (crashes) and then the only option is to reboot. It should never happen! Maybe try Linux kernel 3.2.13 or 3.3 and see if it happens again. BTW, what kernel version are you running? Maybe there is some bug and you have run into it.
P.S. I have two servers; when I had just one, all services ran on it. I never had any problem, and I can assure you that from time to time I overloaded both TCP and UDP (FTP, NFS, Samba, proxy, DNS, at least 3 SSH connections always running, copying (using FTP, NFS, Samba, SSH), sometimes compiling a few programs at once (at most 4 kernels, with sources on the server and the compile process on the desktop)). I never ran into such a problem, so something is terribly wrong with either your usage, your connection or your kernel. It should never happen: the kernel should be able to counter such problems before they become serious. |
Code:
# uname -r
I must confess, I once tried to learn traffic analysis tools like Wireshark, but soon ran out of leisure time and gave up. Maybe I broke something while thoughtlessly configuring Wireshark, etc.? Here's the output of sysctl -a | grep net.ipv: http://pastebin.com/pT0a2UgX |