Losing connection to specific servers until reboot
I have had a networking problem off and on for several years. I could never figure it out before, partly because I saw the problem on only one machine, so there was not much to compare. Now I'm seeing the problem on multiple machines, so it's worth coming back to it.
The problem: Periodically, I lose connection with specific servers. This happens most often with IMAP/SMTP connections to e-mail, but sometimes it happens with specific websites. When this happens, rebooting clears the problem immediately. But, rebooting is a disruption to workflow and I would prefer to avoid it as much as possible.
Observations:
I'll focus on IMAP for simplicity. I have three devices configured to get my e-mail by IMAP:
Before getting the ThinkPad, I used an MSI running Ubuntu 10.04, later upgraded to 12.04. Same problem. That one was using an Atheros WiFi card using the ath9k module. So, the problem is not specific to network hardware or driver.
Tablet: Asus Transformer Pad running Android 4.2.1. E-mail client: AquaMail 1.3.8.
IMAP service: QQ Mail, based in mainland China. imap.qq.com, smtp.qq.com. (I'm living in mainland China. Using a Chinese service should avoid problems with the Great Firewall.)
The problem does NOT happen to all machines at the same time. That is, right now, my computer cannot access the IMAP server for my e-mail, but my phone has no problem accessing it. They are both online through the same WiFi router. I have also seen the phone or the tablet lose contact with IMAP, while the computer and the other Android device are working fine. Therefore, it's unlikely to be a problem with the server or with the ISP. If it were, I would expect all devices to lose connectivity at the same time, and all of them to restore connectivity at once. That's not what I see.
The problem occurs most often with my home DSL connection, but it seems not to be limited to that. My phone has, at times, lost access to IMAP over China Mobile's 4G service, and regained access after a reboot. This isn't guaranteed to be the same problem, but it seems too closely related to be coincidental.
The fact that a reboot temporarily solves the problem tells me that there is some stale information lurking in the networking layer. I'm assured that Linux does not cache DNS info. Okay... then, what is cached? If nothing is being cached, then would the problem not persist after a reboot?
I've seen the problem in four different apartments in Guangzhou, using at least two different ISPs. This also points to an issue in the client machines, rather than servers or network providers.
Android is based on the Linux kernel. I can't confirm that the Android networking stack is the same as the Linux networking stack on my computer, but it doesn't seem coincidental that this problem occurs on three Linux-based devices, but my housemate (using Windows) has never complained about this.
Short version: Multiple ISPs, multiple machines, multiple email clients, but all based on Linux. Reboot always clears it (but only temporarily), so it must be something local.
Does anybody have any ideas? Even random neural firings? The preponderance of evidence points to some bad data lingering in RAM until a reboot purges it. But when I've asked this question before, I've been told that that's not possible. Well, something is causing this...
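One way to narrow this down the next time it happens, before rebooting, is to check whether the failure is at name resolution or at the TCP connect. Below is a minimal Python sketch, not a definitive diagnostic. The function name `probe` is made up for illustration, and port 993 (IMAP over TLS) in the usage note is an assumption, since the thread does not say which port the account uses.

```python
import socket


def probe(host, port, timeout=5.0):
    """Return (stage, detail), where stage is 'dns', 'tcp', or 'ok'.

    'dns' -> the name could not be resolved at all
    'tcp' -> the name resolved, but no address accepted a connection
    'ok'  -> a TCP connection was established (detail is the IP used)
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return ("dns", str(exc))

    last_error = "no addresses returned"
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return ("ok", sockaddr[0])
        except OSError as exc:
            # Remember the most recent failure and try the next address.
            last_error = f"{sockaddr[0]}: {exc}"
    return ("tcp", last_error)
```

For example, run `probe("imap.qq.com", 993)` on the computer that cannot connect and on a machine that can. A `("dns", ...)` result points at name resolution, while `("tcp", ...)` means the name resolved but the connection itself failed. If both machines resolve the name but end up at different IP addresses, that is worth noting as well.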
I'm guessing you are using a SOHO router/switch combo that your ISP gave you, yes?
It sounds like a bad SOHO router to me. Do you lose connectivity from computer to computer if you plug computer A directly into computer B? If not, I would put my money on a bad switch/router.
No, it's a DSL modem and a separate wifi router. (Ancient technology, I know.)
Computer-to-computer is a rough test because the issue is sporadic. After a reboot, the network is fine for a day or two, or sometimes longer... but when it breaks, it's done. I can't realistically take my computer off the network and leave it hard-connected to another computer for a few days.
One test I haven't tried is to wait for the issue to happen again, then cycle power on the router. Another possible test is to disable wifi using F8, and then re-enable it. (IIUC, this should reset the wifi kernel module.)
Rebooting what solves the issue for you? I think, probably, the router sometimes runs out of resources, but ...
Instead of rebooting Linux, you can try just removing the WiFi kernel module and reloading it.
Well, the problem just happened again. So here are the tests I performed.
1. Switch off WiFi using the XF86WLAN key*, and then switch it back on. In my e-mail client (wanderlust in Emacs 24), cycle alt-T to kill any open connection processes so that the next request will start with a fresh connection.
* Someone told me on another forum that disabling WiFi this way would remove the WiFi kernel module, and toggling it back on would restart that module. If that was correct, then the conclusion is valid. If not, could someone tell me the name of the proprietary Broadcom kernel module?
Result: No connection.
Conclusion: If the kernel module was successfully reset, then the problem has nothing to do with the kernel module.
2. Quit Emacs, relaunch and restart wanderlust.
Result: No connection.
Conclusion: The problem is not a bug in wanderlust.
3. Quit all applications, log out and log back in (without rebooting the machine). Then relaunch wanderlust and try e-mail again.
Result: No connection.
Conclusion: The problem persists through shell sessions.
4. Cycle power on the WiFi router, reset wanderlust's connection and try again.
Result: No connection.
Conclusion: The problem seems not to be in the router. Note also that my phone, connected to the same WiFi router, has no problem accessing the IMAP server. As I'm writing this, only my computer is unable to connect.
Which leaves the observation from my first post -- some information, somewhere in my computer, seems to be stale and the only way to clear it is to reboot. It's hard to imagine how it could be a router or ISP problem, because other devices on the same WiFi router have no problem connecting to an IMAP server that my computer simply can't.
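If some state really is stale, one place to look before rebooting is the kernel's own table of TCP sockets, to see whether connections toward the mail server are stuck in a state such as SYN_SENT or CLOSE_WAIT. The following is a rough Python sketch under several assumptions: it reads the IPv4 table only (`/proc/net/tcp`; IPv6 lives in `/proc/net/tcp6` with a different address format), it assumes a little-endian machine (x86 or typical ARM), and the helper names are made up for illustration.

```python
import socket
import struct

# State codes used by the kernel in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def _decode(hex_addr):
    """Turn '0100007F:0277' into ('127.0.0.1', 631) (little-endian host)."""
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Parse the contents of /proc/net/tcp into (local, remote, state) rows."""
    rows = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        state = TCP_STATES.get(fields[3], fields[3])
        rows.append((_decode(fields[1]), _decode(fields[2]), state))
    return rows


def sockets_to_port(rows, port):
    """Keep only the sockets whose remote port matches."""
    return [row for row in rows if row[1][1] == port]
```

To use it, read `/proc/net/tcp` with `open("/proc/net/tcp").read()`, pass the text to `parse_proc_net_tcp`, and filter with `sockets_to_port(rows, 993)` (993 is an assumed IMAP-over-TLS port). Many sockets sitting in SYN_SENT would suggest outgoing packets are not being answered, which is different from the connection never being attempted.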
So, now I'm going to reboot my machine so that I can send one e-mail. Just... one... e-mail... and I have to friggen' reboot???
I do appreciate the advice so far. I hope someone has some other ideas, as this is really quite frustrating -- something that really quite obviously should not happen, but this has been going on for four years.
You can run lsmod before and after switching off WiFi with the XF86WLAN key, and compare the two listings to see whether a kernel module went missing.
As far as I can see, the "switch on" was not successful. You can check lsmod again, and also ifconfig.
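The before/after comparison can be done by eye, but a small script makes it harder to miss a change. This is a minimal Python sketch; it reads `/proc/modules`, which is where lsmod gets its listing, and the function names are made up for illustration.

```python
def parse_modules(text):
    """Return the set of module names from /proc/modules-style text."""
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def read_loaded_modules(path="/proc/modules"):
    """Snapshot the names of the currently loaded kernel modules."""
    with open(path) as handle:
        return parse_modules(handle.read())


def diff_modules(before, after):
    """Report which modules disappeared or appeared between two snapshots."""
    return {
        "removed": sorted(before - after),
        "added": sorted(after - before),
    }
```

Take one snapshot with `read_loaded_modules()` before pressing the XF86WLAN key, another after switching WiFi off, and a third after switching it back on, then pass pairs of snapshots to `diff_modules`. If both lists come back empty, the key did not unload anything, and the conclusion in test 1 would need a manual module reload to confirm.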