LinuxQuestions.org


rigs 10-01-2011 02:08 PM

Strange IRQ problem
 
I have a machine I use as a firewall. After an undetermined amount of time (could be 10 minutes, could be an hour) certain processes become sluggish.

At first I noticed this with the proprietary nvidia drivers. The UI would become very slow (including typing into a terminal window), and restarting X would fix the problem only temporarily. As this was quite annoying, I eventually dropped back to the generic drivers, which solved that problem. Then I noticed that after a while I would get ridiculously low throughput on one of my gigabit NICs. In the logs I see:

... irq 19: nobody cared (try booting with the "irqpoll" option)
...
... Disabling IRQ #19

Nothing is sharing this IRQ. I also tried swapping out the NIC, and tried both it and a different NIC in a different slot. No joy.

The weird part is that in /proc/interrupts the line for this IRQ under CPU0 shows 500001 (what are the odds?).

Any ideas?

mulyadi.santosa 10-02-2011 04:49 AM

Hi rigs....

500001 interrupts on a single IRQ line might be a high number, or it might be just moderate. To judge, compare it with the count for the timer IRQ (IRQ 0).
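One way to make that comparison is to total the per-CPU counts on each line of /proc/interrupts. The following is only a minimal Python sketch and is not taken from this thread: the sample text, the eth1 device name, and the timer count are made up for illustration. Only the 500001 figure comes from the original post.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (per_cpu_counts, description)}."""
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break  # some rows (e.g. ERR) carry fewer counters than CPUs
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


# Hypothetical sample; on a real machine use open("/proc/interrupts").read()
SAMPLE = """\
           CPU0       CPU1
  0:    4000000          0   IO-APIC-edge      timer
 19:     500001          0   IO-APIC-fasteoi   eth1
"""

table = parse_interrupts(SAMPLE)
timer_total = sum(table["0"][0])
irq19_total = sum(table["19"][0])
print("IRQ 19 total:", irq19_total)
print("timer total: ", timer_total)
print("IRQ 19 as a fraction of timer:", irq19_total / timer_total)
```

If IRQ 19 has fired far less often than the timer over the same uptime, the count on its own is probably not what makes the machine sluggish.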

So, I think the sluggishness comes from something else. We need more info from you. Care to tell us:
- your kernel version and distro name+version?
- output of command "top -b -n 1"

That's all for now. Let's discuss more after you get the data.

PS: As a precaution, try to terminate every process (including daemons) you don't need, and avoid running X whenever possible. The idea is to make your environment as pristine as possible, thus reducing the number of suspects.

H_TeXMeX_H 10-02-2011 05:36 AM

Also post the output of 'cat /proc/interrupts', 'lspci -k', and 'lsmod'.

When using the nvidia drivers make sure to put:

Code:

    Option "UseEvents" "1"
in the Device section of xorg.conf.

rigs 10-03-2011 02:08 AM

Thanks for the replies, guys.

I think I solved the problem. I'm passing "noirqdebug" on boot, and so far (a day later) the problem has not returned. I've even reinstalled the nvidia proprietary drivers which seem to be working.
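To confirm that a boot parameter such as noirqdebug actually reached the running kernel, the command line can be read back from /proc/cmdline. This is a small Python sketch under that assumption. The helper names and the sample command line are hypothetical and not part of the original post.

```python
def has_boot_param(cmdline, name):
    """Return True if `name` appears as its own parameter (bare or name=value)."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical command line for illustration; on the real machine use
# has_boot_param(read_cmdline(), "noirqdebug") instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet noirqdebug"
print(has_boot_param(sample, "noirqdebug"))
```

Matching whole tokens rather than searching for a substring avoids a false positive from an unrelated parameter that merely contains the same text.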

It looks like a stray hit on the IRQ causes the kernel to run some debug code every subsequent time the IRQ fires, which as you can imagine slows things down for that IRQ. They really should make noirqdebug the default.

By the way, the weird thing about the 500001 in /proc/interrupts is not that the number is high (it's actually very low) but that it's so round (or maybe "not random").
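A possible reason for such a round number, offered only as a guess: the kernel's "nobody cared" check appears to work in fixed-size windows of interrupts, so the IRQ would be disabled at the end of a window. The Python sketch below is a simplified model, not the kernel code. The window size of 100000 and the threshold of 99900 are assumptions recalled from kernel/irq/spurious.c and may differ between kernel versions, and the model leaves out the kernel's time-based reset of the unhandled counter.

```python
import itertools

WINDOW = 100_000          # assumed: interrupts counted between checks
UNHANDLED_LIMIT = 99_900  # assumed: unhandled count that triggers the disable


def first_disable_count(handled_flags):
    """Walk a stream of booleans (True = a handler claimed the interrupt).

    Returns the total number of interrupts seen when the IRQ would be
    disabled in this model, or None if it never is.
    """
    total = 0
    in_window = 0
    unhandled = 0
    for handled in handled_flags:
        total += 1
        in_window += 1
        if not handled:
            unhandled += 1
        if in_window < WINDOW:
            continue
        # End of a window: disable if almost nothing was handled,
        # otherwise start a fresh window with both counters cleared.
        if unhandled > UNHANDLED_LIMIT:
            return total
        in_window = 0
        unhandled = 0
    return None


# Four healthy windows followed by a run of interrupts nobody handles.
stream = itertools.chain(
    itertools.repeat(True, 400_000),
    itertools.repeat(False, 200_000),
)
disabled_at = first_disable_count(stream)
print(disabled_at)  # 500000
```

In this model the disable can only happen on a multiple of the window size, which would be consistent with a count in /proc/interrupts sitting right next to a round figure like 500000.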

If someone tells me how to mark this as solved, I'll do so. Or a mod can do it.

business_kid 10-03-2011 12:42 PM

As the OP, you should see an option in Thread Tools. You can also edit the first post, go to advanced mode, and edit the subject line. There's a prefix box there with SOLVED in it.

mulyadi.santosa 10-04-2011 01:34 PM

Hi rigs :)

Great to hear you solved the problem. The thing is, after reading http://www.linuxtopia.org/online_boo...tion/re18.html, I think that your system might have broken IRQ handling somewhere. Or maybe the logic inside the Linux kernel itself is indeed broken. :)


All times are GMT -5.