TCP Checksum errors ... only after some amount of time has passed.
Something phenomenal happened. It went "out" but it came back "in". I don't know if it was caused by me or not though.
1. Three FF tabs that I had recently tried to open were still trying to load making me believe that it was happening.
2. Tried to open up some usual sites I go to in links...nope, no go, all stuck on "making connection".
yahoo.com worked for some odd reason, and I mean quickly like there were no dropped packets at all. A wireshark cap confirmed that there were no errors in the connection process from start to finish.
This got me thinking, and so I then tried random sites out. Out of about 20 sites I tried, only about 3 worked, linuxquestions.org being one of those three.
Since ping / ICMP packets never had a problem, unlike the TCP segments with their checksum errors, I tried pinging good & bad sites to see if maybe it was a TTL problem (for example good sites had a high TTL [i.e. fewer hops] and bad sites had a low TTL) but then I realized that would be inconclusive as I don't know the starting TTL of a received packet. So getting TTLs of 56, 120, and 240 really could be good or bad depending on the starting TTL.
The default TTL (/proc/sys/net/ipv4/ip_default_ttl) was 64....I set it to 255 for kicks, but that didn't change anything. I then changed other TCP options (set tcp_sack, tcp_timestamps, tcp_window_scaling off), which didn't seem to make a difference, as new connection attempts to random hosts hung as well.
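One rough way to turn an observed TTL into a hop estimate is to assume the sender started from one of a few common defaults. The Python sketch below assumes initial TTLs of 64, 128, or 255 (an assumption, since a host can be configured with any value), and `estimate_hops` is a hypothetical helper name. It doesn't remove the ambiguity, it just makes the guess explicit:

```python
# Assumed set of common initial TTL values; real hosts may use others.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl):
    """Guess (initial_ttl, hops) for a TTL seen on a received packet.

    Picks the smallest assumed initial TTL that is >= the observed value.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl


if __name__ == "__main__":
    for ttl in (56, 120, 240):
        initial, hops = estimate_hops(ttl)
        print(f"observed TTL {ttl}: assumed start {initial}, about {hops} hops")
```

Under that assumption the three TTLs from the pings come out at roughly 8, 8, and 15 hops, so none of them looks suspicious on its own.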
But somehow sites that I was testing started coming up, which seemed kind of odd 'cause my success rate was like 100%. So then I tried sites that didn't work 5 minutes before and what do you know, they now do.
Now I just have to wait for "it" to happen again so I could see if a) repeatedly trying to load yahoo.com or b) changing those /proc values fixes the problem.
Could be a bug in the driver too and not bad hardware. Like I said, it was easier for me to throw a NIC in the box than waste my time chasing something I couldn't control or identify.
Not that I'm doubting you because this well could be the case, but I wouldn't think so. I'm using kernel 2.6 with via-rhine and it's just fine. I actually use that as a DNS/Squid/IRC/file server. No issues.
Not to mention, if it were a bug in a driver, I'm sure there would be more people screaming about it by now given the circumstances (the computer just sits there and gets TCP checksum errors).
The default TTL (/proc/sys/net/ipv4/ip_default_ttl) was 64....I set it to 255 for kicks, but that didn't change anything.
Be careful about arbitrarily setting parameters like this. TTL is used to keep packets from living forever if they get caught in a routing loop. Not to mention, since this can be used in a type of DoS attack, some ISPs may filter out packets with a TTL that is set too high (since it can needlessly consume processor cycles).
DNS queries work no problem, but connecting is the part that seems to have a problem. Running wireshark, I noticed that the main contributor seems to be TCP checksum errors. Offloading is not the problem because the checksums are always off by 1 (for example, correct checksum could be 0x1234 when the segment might have a checksum of 0x1233). It seems some packets "work" while others don't...and for example just opening something as simple as http://google.com might take about 2-3 minutes for the page to fully load.
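For anyone wanting to double-check what wireshark reports, the TCP checksum is the standard 16-bit one's-complement Internet checksum (RFC 1071), computed over a pseudo-header plus the TCP segment. Here is a minimal Python sketch. The function names and the example addresses (documentation ranges) are made up for illustration, and this is not taken from the capture in this thread:

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP header and payload.

    When computing a fresh checksum, the checksum field inside `segment`
    (bytes 16-17 of the TCP header) must be zero. When verifying a received
    segment, leave the field as received: a result of 0 means it checks out.
    """
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo_header + segment)


if __name__ == "__main__":
    # Sample bytes from the worked example in RFC 1071.
    print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))  # 0x220d
```

Since every byte of the segment feeds into the sum, a value that is consistently off by one would be compatible with something on the path changing a field without fixing up the checksum, though that is only one possible explanation.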
After reading through the thread, this caught my attention; reason being that DNS is UDP and you have no issues. If you could, next time you notice this problem, try connecting a game to a server (as most games run over UDP) and see what happens.
Yeah I'm pretty sure I'd be able to connect to a game server (if I knew how...don't really play games) if it was through UDP. ICMP (ping) packets are fine too.
It's hard to fully debug this problem because I need to see what the other end sees (and sends). Does the remote server even receive my SYNs, or are the packets invalid and getting dropped? It would be interesting if the TCP checksums mysteriously change along the way. I don't doubt AT&T (my ISP) probably has something to do with it, but then again if they did, I should still have the problem after rebooting. That's the big thing....after rebooting the problem is guaranteed to go away, which makes me believe it's a software (or hardware initialization) issue. I also tested to see if I get the problem in Windows and I don't, but the problem could be triggered by something that Linux does (or doesn't do) that Windows doesn't do (or does).
And what makes it harder is the fact that I haven't figured out what triggers the problem, meaning I have to just wait for it to pop up and hope that I'm in the mood to actually do some debugging, which isn't always the case. Most times I just say cut the crap and reboot.
I tried the suggestions listed on the link but they didn't work because the via-rhine driver in its current state (126.96.36.199) doesn't support offloading (tx or rx). I think the device itself does as I've seen a 2.6.22 patch floating around that enables it, and I might give that a try soon.
# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
All the diagnostic commands of ethtool are unsupported as well (result in Operation not supported).
I should've tried them one by one, but here's what I did:
1. "It" happened
2. After about 2 minutes of "it" happening, I changed the tcp params above
3. "It" was gone on the first connection attempt after changing them.
I need to research those params.
But what I'm gonna do is throw those into a script. As soon as it happens again, I'm gonna instantly run that script. If I'm back up right after, then the problem resolution has been isolated.
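A rough Python sketch of such a script, trying the options one at a time instead of all at once. The names `can_connect` and `find_culprit` are hypothetical, the probe host is whatever site is currently failing, and it assumes the three sysctl files under /proc/sys/net/ipv4 are writable (i.e. run as root):

```python
import socket
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/net/ipv4")
OPTIONS = ("tcp_sack", "tcp_timestamps", "tcp_window_scaling")


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_culprit(probe, options=OPTIONS, base=SYSCTL_DIR):
    """Turn off one option at a time and return the first that makes probe() pass.

    An option that does not help is restored to its original value before
    the next one is tried. Returns None if no single option fixes it.
    """
    for name in options:
        path = Path(base) / name
        original = path.read_text().strip()
        path.write_text("0\n")
        if probe():
            return name  # left disabled on purpose so the fix stays in place
        path.write_text(original + "\n")
    return None
```

While "it" is happening, something like `find_culprit(lambda: can_connect("www.example.com"))` (with a host that is failing at that moment) would report which single option, if any, brings connections back.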
Last edited by debuser123; 07-11-2008 at 11:07 AM.
I attempted to connect to eBay, the ip address was 188.8.131.52 this is one of their cgi servers (cgi-core.ebay.com). Originally I noticed that I was unable to completely load pages from them. Further investigation showed that it was only to certain IP addresses. After some additional looking I found that a Windoze machine was able to connect to this address with no problem.
After looking at a network trace I found that the Linux machine correctly sent a SYN packet but no SYN/ACK was returned. The Windows machine sent the SYN and promptly got a SYN/ACK in response. There were obvious differences in the options used in the SYN packet, primarily that TCP Timestamps were enabled on the Linux system. After disabling the Timestamps, connectivity was restored.
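To compare the options in two captured SYNs without eyeballing them, the option list in a raw TCP header can be walked directly. This is a small Python sketch with a hypothetical helper name, and the sample header is hand-built for illustration rather than taken from any capture mentioned here. Option kind 8 is the Timestamps option:

```python
import struct

TCP_OPT_END = 0
TCP_OPT_NOP = 1
TCP_OPT_TIMESTAMPS = 8


def tcp_option_kinds(tcp_header: bytes) -> list:
    """Return the option kind numbers found in a raw TCP header."""
    header_len = (tcp_header[12] >> 4) * 4  # data offset, in bytes
    options = tcp_header[20:header_len]
    kinds = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCP_OPT_END:
            break
        kinds.append(kind)
        if kind == TCP_OPT_NOP:
            i += 1  # NOP is a single byte with no length field
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break  # malformed length, stop instead of looping forever
        i += options[i + 1]
    return kinds


# Hand-built SYN header: MSS, SACK-permitted, Timestamps, NOP, window scale.
demo_options = (
    bytes([2, 4, 0x05, 0xB4, 4, 2, 8, 10])
    + struct.pack("!II", 12345, 0)
    + bytes([1, 3, 3, 7])
)
demo_syn = struct.pack("!HHIIBBHHH", 40000, 80, 1, 0, 10 << 4, 0x02, 5840, 0, 0) + demo_options

if __name__ == "__main__":
    kinds = tcp_option_kinds(demo_syn)
    print(kinds, "timestamps present:", TCP_OPT_TIMESTAMPS in kinds)
```

Running the same function over the SYN from each machine would show at a glance whether Timestamps (kind 8) is the only difference.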
Lol, kind of makes sense if you read the title of this thread.
So, I think I am gonna blame it on my ISP...There's probably a traffic-shaping device that's messing with me or a bad box somewhere along some internet routes.
I think I can now put this issue to rest and just leave timestamps disabled altogether; thanks to all who offered help & suggestions.
Last edited by debuser123; 07-11-2008 at 12:32 PM.