NETDEV WATCHDOG: eth0: transmit timed out
Hi, Penguins.
This is one of those old problems, and now I am hitting it: Ethernet dies under some conditions. My system: Toshiba Satellite A100 ST2311. eth0 is RTL8139 B/D. I checked with kernels 2.6.15, 2.6.17 and 2.6.18. The error I get (through dmesg) is Code:
NETDEV WATCHDOG: eth0: transmit timed out
What I have looked at so far:
1. Disable ACPI -> My BIOS does not have that option.
2. Check IRQ -> ifconfig says "Interrupt:185 Base address:0x2000" (185!?)
Does anybody know how to solve this? Happy Penguins! |
I had a similar problem in the past. I think my problem was related to apm or apci. Does `ifconfig eth0 down; ifconfig eth0 up` bring your device back to life? Or, alternatively, `rmmod rtl8139; insmod rtl8139 || modprobe rtl8139` (assuming you're loading your network device as a module)?
Did you recompile your kernel leaving power management support out, or are you assuming power management was left out? "Assuming" is the king of all Mother Fathers. [use your imagination] |
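For anyone who wants to try the two recovery attempts suggested above, here is a rough sketch of them as shell functions. The function names and the RUN variable are made up for illustration. The module name 8139too is an assumption (the post above says rtl8139), so check `lsmod` for the real one. By default the functions only print the commands.

```shell
#!/bin/sh
# Sketch only. bounce_iface, reload_driver and RUN are hypothetical names.
# RUN defaults to "echo", so the commands are only printed (dry run).
# To really run them, become root and set RUN to empty, e.g.:
#   RUN= bounce_iface eth0

# Attempt 1: take the interface down and bring it back up.
bounce_iface() {
    iface=${1:-eth0}
    run=${RUN-echo}
    $run ifconfig "$iface" down
    $run ifconfig "$iface" up
}

# Attempt 2: unload and reload the driver module.
reload_driver() {
    iface=${1:-eth0}
    module=${2:-8139too}    # assumption: check the real name with lsmod
    run=${RUN-echo}
    $run ifconfig "$iface" down
    $run rmmod "$module"
    $run modprobe "$module"
    $run ifconfig "$iface" up
}
```

If `bounce_iface eth0` does not bring the link back, `reload_driver eth0 8139too` would be the next thing to try. Neither is a fix, only a way to see whether the card can be revived without a reboot.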
Thanks,
I agree with you. This problem seems to be related to ACPI (or apci .. typo ??). Here is what I did: unselect ACPI and compile the kernel -> results in Code:
VFS: Cannot open root device "801" or unknown-block(8,1)
It seems that several other penguins have had trouble with this message and ACPI. I will work on this and post an update. Happy penguins! |
Hi,
1. I found several posts which say that APIC is related. -> With kernel option "noapic", the situation got worse. Almost instantly, eth0 dies!!
2. I also found several posts which say that ACPI is related. -> Kernel option "acpi=no" results in a kernel panic at startup!!
3. So far, kernels 2.6.18 and 2.6.15 die faster than 2.6.17. eth0 survives about an hour with 2.6.15 and 2.6.18; with 2.6.17, on the other hand, it survives several hours or more.
Happy Penguins! |
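When comparing kernels or boot options like this, it may help to count the watchdog messages instead of going by feel. A minimal sketch, assuming the message text is exactly what dmesg shows in the first post. The function name count_tx_timeouts is made up.

```shell
#!/bin/sh
# Sketch only. count_tx_timeouts is a hypothetical helper name.
# Reads kernel log text on stdin and prints how many
# "NETDEV WATCHDOG: <iface>: transmit timed out" lines it contains.
count_tx_timeouts() {
    iface=${1:-eth0}
    awk -v dev="$iface" '
        index($0, "NETDEV WATCHDOG: " dev ": transmit timed out") { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}
```

Usage would be something like `dmesg | count_tx_timeouts eth0`, noted together with the uptime for each kernel and option you test.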
I fought this for over a month on a system with three different ethernet controllers, a VIA VT310-DP with e100, Rhine-II, and VIA Velocity interfaces. Thought it was the controller/driver but eventually every interface exhibited the same type of error as I updated others. The error sometimes didn't occur for 24 hours, sometimes in 30 minutes, and under varying traffic loads, day or night.
My fix? I booted the kernel with the "pci=noacpi" option. I needed the functionality of ACPI for things like the power button so completely disabling ACPI wasn't possible, but haven't noticed any significant difference in how the machine operates with the less-intrusive PCI option. I'm using the e100 and velocity interfaces, with the Rhine now disconnected. I'm currently on kernel 2.6.18 with e100 driver version 3.5.17-k2-NAPI, via-velocity version 1.14, and via-rhine version 1.4.1 but had this problem with 2.6.17 as well (I started with that version on this platform so don't have any experience with earlier versions). Hope this helps someone. This drove me crazy as it was impossible to reproduce on a non-production system. Flood-pinged one for hours and it never flinched, while the one installed as a NAT gateway (with 300 hosts behind it) rarely ran for more than 12 hours. |
There appeared to be a significant change in interrupt handling in 2.6.19 but the problem persists, unchanged.
I also discovered that although ACPI isn't completely disabled with the pci=noacpi option, it is broken. The type of interrupt (edge or level) gets set incorrectly so ACPI events are never detected. It's supposed to be 'level' but is set to 'edge' instead. So much for having the power button work. |
Hi,
I tried many things, including sending the hardware back to the manufacturer's facility. I have not seen network trouble in wIdNowS. I tried kernels 2.6.15-18 with several different options, and the results are more or less the same. (2.6.19 did not kick in, although I have not spent time on it.) Also, there is another thread:
http://www.linuxquestions.org/questi...d.php?t=515819
I somewhat think this may be intrinsic to the hardware.... lspci says Code:
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
lshw says Code:
*-network:1 |
Hi,
Further experiments on my penguin are still without good news. (ACPI verbose mode does not give me much info...) Further web searching is kind of disappointing.
http://bugzilla.kernel.org/show_bug.cgi?id=6138
http://lkml.org/lkml/2004/5/24/4
http://lkml.org/lkml/2004/11/1/9
I am not good at following the threads above. Status is "NEW" in bugzilla, so not much hope?? Anyway, Happy Penguins! |
Other threads suggest forcing the ethernet device to do no auto-negotiation with mii-tool or ethtool (whatever it accepts). If this works but it is not a kernel module driver option you can add to modprobe.conf or equiv, you'll have to set it each time using your distro's networking script(s). In addition http://www.linuxquestions.org/questi...d.php?t=347599 mentions disabling the mDNSResponder service.
|
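For reference, forcing the link with ethtool could look roughly like the sketch below. The speed and duplex values (100 Mbit, full duplex) are assumptions and must match what the switch port on the other end is set to. The function names and the RUN variable are made up. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch only. force_link, restore_autoneg and RUN are hypothetical names.
# RUN defaults to "echo" (dry run). As root, set RUN to empty to execute.

# Turn auto-negotiation off and pin speed/duplex.
force_link() {
    iface=${1:-eth0}
    speed=${2:-100}      # assumption: 100 Mbit/s
    duplex=${3:-full}    # assumption: full duplex
    run=${RUN-echo}
    $run ethtool -s "$iface" autoneg off speed "$speed" duplex "$duplex"
}

# Put the interface back on auto-negotiation.
restore_autoneg() {
    iface=${1:-eth0}
    run=${RUN-echo}
    $run ethtool -s "$iface" autoneg on
}
```

With mii-tool, the rough equivalent would be `mii-tool -F 100baseTx-FD eth0`. Either way the setting is lost on reboot unless your distro's networking scripts reapply it.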
Hi,
Thank you for your post. However, disabling autoneg does not help (if anything, the situation gets worse), and I do not have mDNSResponder running. One thing I notice is that dmesg identifies the chip as follows (which does not agree with lspci or lshw ??) Code:
eth0: Identified 8139 chip type 'RTL-8100B/8139D' |
Hi kaz2100,
I have exactly the same problem with my Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) network card. It was working fine in Ubuntu 6.06 with the 2.6.15-23 kernel until recently. It suddenly stopped working, throwing the same NETDEV WATCHDOG error on boot up as in your case. It might be an ACPI related error, but I'm not sure. I can see some ACPI messages related to the network card, like:
ACPI: PCI interrupt for device 0000:03:07.0 disabled
ACPI: PCI interrupt 0000:03:07.0[A] -> GSI 16 (level, low) -> IRQ 169
where 0000:03:07.0 is the PCI address of the network card. I have also noticed that the same interrupt, 169, was given to the graphics card. The modules I have loaded for this card are 8139cp and 8139too. Any update about this issue? |
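Whether the network card shares its interrupt with another device can be read from /proc/interrupts. A minimal sketch, assuming the usual layout where the device names come last on each line and shared devices are separated by commas. The function name irq_sharers is made up.

```shell
#!/bin/sh
# Sketch only. irq_sharers is a hypothetical helper name.
# Reads /proc/interrupts-style text on stdin. For every line that lists
# the given device, prints the IRQ number and "shared" or "exclusive".
irq_sharers() {
    awk -v dev="${1:-eth0}" '
        {
            found = 0
            for (i = 2; i <= NF; i++) {
                f = $i
                sub(/,$/, "", f)          # drop the comma between shared names
                if (f == dev) found = 1
            }
            if (found) {
                irq = $1
                sub(/:$/, "", irq)        # "169:" -> "169"
                state = ($0 ~ /,/) ? "shared" : "exclusive"
                print irq, state
            }
        }
    '
}
```

Usage: `irq_sharers eth0 < /proc/interrupts`. A "shared" result by itself does not prove anything, since IRQ sharing is normal on PCI, but it is one more data point when comparing boots with and without the ACPI/APIC options.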
Hi,
Unfortunately, my penguin still has the same problem..... Happy Penguins! |
Hi,
Just to inform you that I have solved the problem. I was lucky that I still have dual boot on my laptop. Yes, unfortunately Windows helped me bring the network card back to life. I restarted into Windows, and the switch port where the card was connected started flashing. When I got to the Windows login screen I restarted again, but into Linux this time. During the restart I could see the port light on the switch working all the time, and when I logged in the card was working! So I guess Windows somehow reset the network card in a way Linux was not able to. It must be that the driver cannot establish initial communication on the network interface after some strange event, like a power outage or an irregular power off, blocks the card's TX capability. Cheers, |
I had this same problem, very mysterious. I would get the NETDEV WATCHDOG timeout error, seemingly at random times, and my network interface would stop working until I rebooted the whole machine. I partially disabled ACPI in the kernel (not APIC; these are two different things, not a typo) by entering the following line in my /etc/lilo.conf
append = "pci=noacpi"
Since then I have not (yet?) had a recurrence of the problem. |
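After editing lilo.conf (and rerunning `lilo`) and rebooting, it is worth confirming that the option really reached the kernel, since a misspelled option is easy to miss. A minimal sketch that checks /proc/cmdline for an exact match. The function name has_boot_option is made up.

```shell
#!/bin/sh
# Sketch only. has_boot_option is a hypothetical helper name.
# Returns success if the exact option appears on the kernel command line.
# $1 = option, $2 = file to read (defaults to /proc/cmdline).
has_boot_option() {
    opt=$1
    file=${2:-/proc/cmdline}
    # One word per line, then look for an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$opt"
}
```

Usage: `has_boot_option pci=noacpi && echo "pci=noacpi is active"`. Because the match is exact, something like pci=noapci would not be reported as pci=noacpi.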
Hya
Thanks, nemestrinus.
For some reason, append="pci=noapci" in lilo.conf results in a kernel panic with my penguin.
Thanks, igorc.
Just restarting within penguin is good enough to clear the trouble. Then, after a while, it is dead again.... Happy Penguins! |