LinuxQuestions.org

Pepijn Schmitz 09-07-2006 10:10 AM

TCP connections stall after a while on FC5
 
I'm having a very strange problem with a Fedora Core 5 (FC5) box I set up recently as a MythTV frontend: almost all outgoing TCP connections stall after transferring some data (sometimes just a few kilobytes, sometimes after several megabytes)! It does not seem to matter where I'm connecting to. Connections to my local MythTV backend fail just as predictably as connections to, for instance, the Fedora update sites. Obviously this is extremely annoying, as it makes it impossible to use the box for its intended purpose: MythTV frontend. After a couple of seconds, every connection just stutters and dies...

Even more strangely, if I then start a ping -A 10.0.0.1 (where 10.0.0.1 is my router) in a separate window, the connection unstalls and continues, although slower, even if the stalled connection was to my MythTV backend (which doesn't go through the router). This workaround enables me to at least use the box somewhat normally, but obviously it's not ideal and should not be necessary. I've been Googling it, but I can't find any recent information about problems with stalling TCP connections in Linux.

My setup looks like this:

Code:

                     +----------------+
                     | MythTV Backend |
                     | (FC5 box)      |
                     +----------------+
+--------+                   |      +------------+   +-------+
| MythTV |   +-----+      +-----+   | NAT Router |   | Cable |
| (FC5   |---| Hub |      | Hub |---| / Firewall |---| Modem |
| box)   |   +-----+      +-----+   | (FC3 box)  |   +-------+
+--------+      |            |      +------------+       |
          +----------+   +--------+                ************
          | Wireless |...| Access |                * Internet *
          | Bridge   |   | Point  |                ************
          +----------+   +--------+

The MythTV box is the one having the problem. It does not seem to matter whether the connection is to the Internet somewhere, or to my local MythTV backend, so I do not think the problem is with my MythTV Backend or router. Previously I had installed Ubuntu 6.06 on the MythTV box, and it did not have this problem, nor did older versions of Fedora Core, so I don't think it is a hardware problem either, or a problem with my network.

The network configuration on all the boxes is completely standard. I've made no changes to any of the settings in /proc/sys/net/ipv4 (except to test potential solutions, but none of them worked so I reversed them all). I suspect that some kind of problem in my Linux kernel is being triggered by some oddity in my network, or by one of the other Linux kernels. I'm at work now, so I can't check the exact kernel versions, but all the boxes should be completely up-to-date.

I hope all this rings a bell with someone here! Can anybody help me figure this out? Or point me to some good, detailed resources about how to go about troubleshooting this problem? Many thanks in advance!

Kind regards,
Pepijn

jcliburn 09-07-2006 08:57 PM

On the FC5 boxes, try executing this command as root, then retry your connections:

sysctl -w net.ipv4.tcp_window_scaling=0

Pepijn Schmitz 09-09-2006 07:45 PM

Quote:

Originally Posted by jcliburn
On the FC5 boxes, try executing this command as root, then retry your connections:

sysctl -w net.ipv4.tcp_window_scaling=0

That seems to have done the trick, thanks! Which is weird, because I thought I already tried that, only I did it by echoing 0 to /proc/sys/net/ipv4/tcp_window_scaling (as root). Is there a difference?

Anyway, thanks a lot for helping me out!

Can you tell me how I can make this permanent, so I don't have to do it each time I boot the machine?

Cheers,
Pepijn

jcliburn 09-09-2006 07:57 PM

Echo should've done the same thing.

echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

To make it permanent, add this line to /etc/sysctl.conf:

net.ipv4.tcp_window_scaling = 0
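
On the echo-versus-sysctl question above: a sysctl key and its file under /proc/sys are two names for the same kernel setting, with the dots in the key turned into slashes. A minimal sketch of that mapping follows; the helper name sysctl_to_proc is made up for illustration, and it assumes a plain key with no dots inside a component.

```shell
#!/bin/sh
# Sketch only: sysctl_to_proc is a hypothetical helper, not part of procps.
# Assumes a simple key whose components contain no dots of their own.
sysctl_to_proc() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the /proc/sys file that backs the key discussed in this thread.
sysctl_to_proc net.ipv4.tcp_window_scaling

# Read the live value either way (reading does not need root):
#   sysctl net.ipv4.tcp_window_scaling
#   cat "$(sysctl_to_proc net.ipv4.tcp_window_scaling)"
# After editing /etc/sysctl.conf, reload it without rebooting (as root):
#   sysctl -p
```

Reading the value back after a reboot is a quick way to confirm that the line in /etc/sysctl.conf was actually picked up.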

Pepijn Schmitz 09-14-2006 04:22 PM

It seems I spoke too soon... :-(

Setting net.ipv4.tcp_window_scaling to 0 seems to have made the problem slightly smaller, but it has not gone away. Connections still stall on a regular basis, which is extremely annoying: I can't watch live TV or recordings, because they all stall after anything from a few seconds to a couple of minutes. I can use the "ping -A <router>" trick to make it work somewhat, but that's far from ideal. It floods my WLAN, so various other devices suddenly can't connect to their servers anymore, and I also suspect it puts quite a strain on the Myth frontend itself, which seems to become noticeably less responsive.

Does anyone know what else I could try? Or how I should go about trying to find out what the cause is?

Pepijn Schmitz 12-09-2006 08:44 AM

*Bump*

Can anyone help with the problem I describe above in the first message? Setting tcp_window_scaling to 0 unfortunately does *not* help.

sal_paradise42 12-09-2006 12:37 PM

Can you get us a capture of the SYN/ACK exchange, running up to the point where the problem starts happening?
tcpdump -n -vvv -s 1518 -i <int-name>

The output of ifconfig would also be nice.

And just a hunch, but can you try a ping with the DF (don't fragment) bit set:
ping -M do -s 1472 <Mythtvbox-ip>
and vice-versa
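
The 1472 in that ping is the standard 1500-byte Ethernet MTU minus 20 bytes of IPv4 header (no options) and 8 bytes of ICMP echo header. If the full-size ping fails, smaller sizes can narrow down the largest packet that gets through. A rough sketch, where the helper name icmp_payload_for_mtu and the candidate MTU values are only illustrative assumptions:

```shell
#!/bin/sh
# Sketch only: icmp_payload_for_mtu is a hypothetical helper name.
# ICMP payload = MTU - 20 (IPv4 header, no options) - 8 (ICMP echo header).
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Example loop, left commented out because it sends real traffic.
# Set target to the address of the box at the other end first.
# for mtu in 1500 1492 1400; do
#     ping -c 3 -M do -s "$(icmp_payload_for_mtu "$mtu")" "$target"
# done
```

If the 1500 case fails while a smaller one succeeds, something on the path (the wireless bridge, for instance) may be using a smaller MTU than the Ethernet default.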

teckk 12-09-2006 03:51 PM

I've had that problem with Linux and a certain router that was set to assign IP addresses with DHCP. Just like you said: random stalls. I was issuing ifdown and ifup for a while to make it work.

I fixed it by assigning everything on the network a static IP and never had another stall.

I even stopped using Linux for a while and switched to BSD because of it until I found the answer. I don't know the cause of it, but assigning fixed IP addresses cured it. The Windows boxes and FreeBSD boxes ran fine, but the Linux machines would stall out, which was most annoying. I think it had to do with the router renewing leases.

I never did find out what the root cause was. Let us know if you discover it. Good luck.


All times are GMT -5.