LinuxQuestions.org

Rainfly_X 11-21-2012 10:09 PM

ath9k driver fails on Crunchbang after certain amount of time (Acer Aspire)
 
My laptop is basically an Acer Aspire rebranded to be a Gateway Something (T100, I think? The marketing people REALLY did not give a shit). For all intents and purposes, though, it's an Aspire. And it has a slight problem where the wifi gives out after extended periods of the machine running.

Now, this isn't easy to test, because it takes forever to manifest. My current uptime is about 50 days (rounding up from a very top-heavy 49), and I'm only running into it now. I think it might involve some variable in the driver overflowing after a certain amount of bandwidth passes through it. Even so, it has a habit of dying when I actually need it and don't want to close everything and restart. So I'm trying to solve it more permanently now, while it's convenient.
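One piece of arithmetic bears on the overflow theory. An unsigned 32-bit counter that ticks once per millisecond wraps after roughly 49.7 days, which is close to the uptime quoted above. This is only a sketch of the calculation. Whether ath9k or anything else in this kernel keeps such a counter is an assumption, not something this thread confirms, and wrap_days is a hypothetical helper name.

```shell
# Days until an unsigned counter of the given width wraps around.
# $1 = counter width in bits, $2 = ticks per second (both are assumptions
# chosen for illustration; nothing here is read from the driver).
wrap_days() {
    LC_ALL=C awk -v bits="$1" -v hz="$2" \
        'BEGIN { printf "%.2f\n", 2^bits / hz / 86400 }'
}

# Example: a 32-bit millisecond counter
# wrap_days 32 1000
```

If the failure is time-based rather than bandwidth-based, it should recur at about the same uptime regardless of how much traffic went through the card.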

Symptoms:

Hardware light on the right turns off. nm-applet keeps trying to access the network it was just on, prompting for the password. This does not succeed. wlan0 is still listed in ifconfig. Restarting /etc/init.d/networking has no effect.

Attempted treatment:

So far, I've tried unloading and reloading the Atheros wireless kernel modules: ath9k, ath5k, and ath. I know from previous battles that these are the correct drivers. I've actually salvaged the situation before, which is why I know it can be done, but so far it's been dumb luck every time I've succeeded. I'm hoping to document it publicly this time around, for my own sake and for lurkers like myself.

The result is that I can't get wlan0 to show up in ifconfig or iwconfig anymore, and "ifup wlan0" doesn't work either. Reloading the kernel modules produces output like this in dmesg (timestamps omitted, as I'm having to retype the information manually):

Code:

ath9k 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
ath9k 0000:02:00.0: setting latency timer to 64
ath9k 0000:02:00.0: failed to initialize device
ath9k 0000:02:00.0: PCI INT A disabled
ath9k: probe of 0000:02:00.0 failed with error -5

SUSPICIONS:

From my vague recollections of past battles, I'm fairly convinced that this whole thing traces back to power management gone awry. Also, a kernel message about disabling IRQ 17 ("Disabling IRQ #17") showed up on the terminal earlier, at the same time the mess started. Another thing I see is that the uhci_hcd module is also getting its hands on IRQ 17 for some reason or another, and may be the cause of interference, although it's also messing with the other IRQs 16-19.
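To check the IRQ-sharing suspicion, the row for a given IRQ in /proc/interrupts lists every driver registered on that line. The following is a minimal sketch. irq_line is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a saved copy of the file.

```shell
# Print the /proc/interrupts row for one IRQ, so the drivers sharing it
# (for example uhci_hcd and ath9k) are visible on a single line.
# $1 = IRQ number
# $2 = file to read (hypothetical override; defaults to /proc/interrupts)
irq_line() {
    grep -E "^[[:space:]]*$1:" "${2:-/proc/interrupts}"
}

# Example: irq_line 17
```

Running it while the wifi works and again after it dies shows whether ath9k is still registered on the line and whether the interrupt counts are still increasing.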

Rainfly_X 11-21-2012 10:37 PM

I found something that sounds promising. It's an issue with the Debian stable kernel (which is what I'm running) where an IRQ gets disabled after several days (what I'm experiencing) and fraks everything up (preach it, brother).

http://lkml.indiana.edu/hypermail/li...05.3/0155.html

Mind you, in this case, it's messing up the OP's RAID system (if I'm reading this correctly). Also, it seems like a driver issue, where under rare circumstances the driver does not return IRQ_HANDLED. While I'd love to help contribute to making the ath9k driver less shaky in any way that I can, Debian stable is not exactly the first kid in line for upstream updates, so I'm still looking for a way to consistently fix the problem after it occurs. Something I can write up as a fix_networking.sh script and post on here when the dust settles. So the hunt, it seems, is not over.

Rainfly_X 11-23-2012 11:45 PM

I've tried removing the conflicting kmod (sudo modprobe -r uhci_hcd) and unloading/reloading the ath drivers, to no effect. And since I've gotten no response here and I need to actually use my laptop for things, I'm going to finally restart it, rather than keep putting it off in an attempt to preserve this broken state for experimentation.
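For the record, the unload/reload sequence described in this thread can be sketched as a script. It only replays the steps already tried (which did not recover the card this time), so treat it as a starting point for a fix_networking.sh, not a working fix. The run and reload_wifi names and the DRY_RUN switch are hypothetical additions for illustration.

```shell
#!/bin/sh
# Sketch of the recovery steps tried in this thread.
# Set DRY_RUN=1 to print the commands instead of executing them.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the Atheros modules, load ath9k again, then bring the interface up.
# $1 = interface name (defaults to wlan0). Needs root when not in dry-run mode.
reload_wifi() {
    iface="${1:-wlan0}"
    for mod in ath9k ath5k ath; do
        # A module that is not loaded makes modprobe -r fail; carry on anyway.
        run modprobe -r "$mod" || true
    done
    run modprobe ath9k || return 1
    run ifup "$iface"
}

# Example: DRY_RUN=1 reload_wifi wlan0
```

Checking dmesg right after the modprobe step shows whether the probe still fails with error -5, in which case the ifup step has nothing to work with.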

If anyone posts any ideas on here, I'll try them next time I experience this problem, but please be advised that it may be a while before it happens again. I think I might be able to induce it by repeatedly closing my laptop lid and bringing it back out of sleep, but no guarantees.


All times are GMT -5.