LinuxQuestions.org - NETDEV WATCHDOG: eth0: transmit timed out

Page 1 of 2


kaz2100

10-13-2006 02:13 PM

NETDEV WATCHDOG: eth0: transmit timed out

Hi, Penguins.

This is one of those old problems.

I am hitting this error (Ethernet dies under some conditions).

My system:
Toshiba Satellite A100 ST2311.
eth0 is RTL8139 B/D.
I checked with kernels 2.6.15, 2.6.17, 2.6.18.

The error I got (through dmesg) is

Code:

NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 0c 0005 c07f media 00.
eth0: Tx queue start entry 4  dirty entry 0.
eth0:  Tx descriptor 0 is 0008a03c. (queue head)
eth0:  Tx descriptor 1 is 0008a03c.
eth0:  Tx descriptor 2 is 0008a03c.
eth0:  Tx descriptor 3 is 0008a03c.
eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
NETDEV WATCHDOG: eth0: transmit timed out

I checked several sites and they say:
1. Disable ACPI -> My BIOS does not have that option.
2. Check the IRQ -> ifconfig says "Interrupt:185 Base address:0x2000" (185!?)

Does anybody know how to solve this?

Happy Penguins!

Blindsight

10-13-2006 11:42 PM

I had a similar problem in the past. I think my problem was related to apm or apci. Does `ifconfig eth0 down; ifconfig eth0 up` bring your device back to life? Or, alternatively, `rmmod rtl8139; insmod rtl8139 || modprobe rtl8139` (assuming you're loading your network device as a module)?

Did you recompile your kernel leaving power management support out, or are you assuming power management was left out? "Assuming" is the king of all Mother Fathers. [use your imagination]
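
The two recovery attempts suggested above can be sketched as a small shell function. This is only an illustration: the function name `reset_nic` and the `DRY_RUN` switch are made up for this example, and the default module name `8139too` is an assumption taken from the `driver=8139too` field in the lshw output posted later in this thread (the post above says `rtl8139`). With `DRY_RUN=1`, the default here, the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch: bounce an interface, then reload its driver module.
# reset_nic and DRY_RUN are hypothetical names used for illustration.
# Running the real commands requires root.

reset_nic() {
    iface="${1:-eth0}"        # interface to reset (assumed default)
    module="${2:-8139too}"    # driver module (assumed default)

    # In dry-run mode every command is prefixed with "echo".
    run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi

    # Attempt 1: take the interface down and bring it back up.
    $run ifconfig "$iface" down
    $run ifconfig "$iface" up

    # Attempt 2: unload and reload the driver, then bring the link up again.
    $run rmmod "$module"
    $run modprobe "$module"
    $run ifconfig "$iface" up
}
```

Calling `reset_nic eth0 8139too` prints the five commands; running `DRY_RUN=0 reset_nic eth0 8139too` as root would execute them.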

kaz2100

10-16-2006 04:00 PM

Thanks,

I agree with you. This problem seems to be related to ACPI (or apci .. typo ??).

Here is what I did:
deselected ACPI and compiled the kernel -> results in

Code:

VFS: Cannot open root device "801" or unknown-block(8,1)

and kernel panic.

It seems that several other penguins have had trouble with this message and ACPI.

I will work on this and post an update.

Happy penguins!

kaz2100

10-17-2006 03:50 PM

Hi,

1. I found several posts which say that APIC is related. -> With the kernel option "noapic", the situation got worse. eth0 dies almost instantly!!

2. I also found several posts which say that ACPI is related. -> Booting with the kernel option "acpi=no" results in a kernel panic at startup!!

3. So far, kernels 2.6.18 and 2.6.15 die faster than 2.6.17. eth0 survives about an hour with 2.6.15 and 2.6.18; with 2.6.17, on the other hand, it survives for several hours or more.

4. Happy Penguins!

kgirrard

12-04-2006 07:43 PM

I fought this for over a month on a system with three different ethernet controllers, a VIA VT310-DP with e100, Rhine-II, and VIA Velocity interfaces. Thought it was the controller/driver but eventually every interface exhibited the same type of error as I updated others. The error sometimes didn't occur for 24 hours, sometimes in 30 minutes, and under varying traffic loads, day or night.

My fix? I booted the kernel with the "pci=noacpi" option. I needed the functionality of ACPI for things like the power button so completely disabling ACPI wasn't possible, but haven't noticed any significant difference in how the machine operates with the less-intrusive PCI option. I'm using the e100 and velocity interfaces, with the Rhine now disconnected.

I'm currently on kernel 2.6.18 with e100 driver version 3.5.17-k2-NAPI, via-velocity version 1.14, and via-rhine version 1.4.1 but had this problem with 2.6.17 as well (I started with that version on this platform so don't have any experience with earlier versions).

Hope this helps someone. This drove me crazy as it was impossible to reproduce on a non-production system. Flood-pinged one for hours and it never flinched, while the one installed as a NAT gateway (with 300 hosts behind it) rarely ran for more than 12 hours.
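
Since several options are being tried in this thread (`pci=noacpi`, `noapic`), it can help to confirm that the running kernel was actually booted with the intended one. A minimal sketch, assuming a Linux system where `/proc/cmdline` holds the boot parameters; the function name `has_boot_option` is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Sketch: report whether a given parameter is on the kernel command line.
# has_boot_option is a made-up helper name for this example.

has_boot_option() {
    opt="$1"                       # parameter to look for, e.g. pci=noacpi
    file="${2:-/proc/cmdline}"     # where to read the command line from

    # Compare each whitespace-separated word exactly, so "acpi=off"
    # does not match "pci=noacpi" by accident.
    for word in $(cat "$file"); do
        if [ "$word" = "$opt" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `has_boot_option pci=noacpi && echo "booted with pci=noacpi"` prints the message only when the option was really passed by the boot loader.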

kgirrard

12-09-2006 11:21 AM

There appeared to be a significant change in interrupt handling in 2.6.19 but the problem persists, unchanged.

I also discovered that although ACPI isn't completely disabled with the pci=noacpi option, it is broken. The type of interrupt (edge or level) gets set incorrectly so ACPI events are never detected. It's supposed to be 'level' but is set to 'edge' instead. So much for having the power button work.

kaz2100

01-31-2007 11:29 AM

Hi,

I tried many things, including sending the hardware back to the manufacturer's facility. I have not seen network trouble in wIdNowS. I tried kernels 2.6.15-18 with several different options and the result is more or less the same. (2.6.19 did not kick in, although I have not spent time on it.)
Also, there is another thread.
http://www.linuxquestions.org/questi...d.php?t=515819
I somewhat think this may be intrinsic to the hardware....

lspci says

Code:

Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)

lshw says

Code:

 *-network:1
                description: Ethernet interface
                product: RTL-8139/8139C/8139C+
                vendor: Realtek Semiconductor Co., Ltd.
                physical id: 7
                bus info: pci@02:07.0
                logical name: eth0
                version: 10
                serial: 00:a0:d1:2e:b1:ea
                size: 10MB/s
                capacity: 100MB/s
                width: 32 bits
                clock: 33MHz
                capabilities: bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=8139too driverversion=0.9.27 duplex=half ip=129.22.227.82 latency=64 link=no maxlatency=64 mingnt=32 multicast=yes port=MII speed=10MB/s
                resources: ioport:a000-a0ff iomemory:c0211000-c02110ff irq:193

In any case, Happy Penguins!

kaz2100

02-09-2007 01:38 PM

Hi,

Further experiments on my penguin have still brought no good news. (ACPI verbose mode does not give me much info...)

Further web searching is kind of disappointing.

http://bugzilla.kernel.org/show_bug.cgi?id=6138
http://lkml.org/lkml/2004/5/24/4
http://lkml.org/lkml/2004/11/1/9

I am not very good at following the above threads. The status is "NEW" in bugzilla, so not much hope??

Anyway, Happy Penguins!

unSpawn

02-12-2007 06:33 AM

Other threads suggest forcing the ethernet device to do no auto-negotiation with mii-tool or ethtool (whatever it accepts). If this works but it is not a kernel module driver option you can add to modprobe.conf or equiv, you'll have to set it each time using your distro's networking script(s). In addition http://www.linuxquestions.org/questi...d.php?t=347599 mentions disabling the mDNSResponder service.
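
The auto-negotiation suggestion above could look roughly like the sketch below. The function name `force_link` and the `DRY_RUN` switch are invented for this example, and the target of 100 Mbps full duplex is an assumption based on the "link up, 100Mbps, full-duplex" line in the first post's dmesg output; the correct value depends on the switch port. The setting does not survive a reboot or a driver reload. With `DRY_RUN=1`, the default here, the command is only printed.

```shell
#!/bin/sh
# Sketch: force speed/duplex and turn auto-negotiation off.
# force_link and DRY_RUN are hypothetical names; root is needed for a real run.

force_link() {
    iface="${1:-eth0}"    # interface to configure (assumed default)

    # In dry-run mode the command is prefixed with "echo".
    run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi

    # Prefer ethtool when it is installed, otherwise fall back to mii-tool.
    if command -v ethtool >/dev/null 2>&1; then
        $run ethtool -s "$iface" speed 100 duplex full autoneg off
    else
        $run mii-tool -F 100baseTx-FD "$iface"
    fi
}
```

`force_link eth0` shows which command would be used on the current machine; `DRY_RUN=0 force_link eth0` as root applies it.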

kaz2100

02-12-2007 11:32 AM

Hi,

Thank you for your post.

However, disabling autoneg does not help (if anything, the situation gets worse), and I do not have mDNSResponder running.

One thing I notice is that dmesg identifies the chip as follows (which does not agree with lspci or lshw ??)

Code:

eth0: Identified 8139 chip type 'RTL-8100B/8139D'

Happy Penguins!

igorc

03-27-2007 02:25 AM

Hi kaz2100,

I have exactly the same problem with my Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) network card. It was working fine in Ubuntu 6.06 with the 2.6.15-23 kernel until recently. It suddenly stopped working, throwing the same NETDEV WATCHDOG error as in your case on boot up.

It might be ACPI related error but I'm not sure. I can see some ACPI messages related to the network card like:
ACPI: PCI interrupt for device 0000:03:07.0 disabled
ACPI: PCI interrupt 0000:03:07.0[A] -> GSI 16 (level, low) -> IRQ 169

where 0000:03:07.0 is the PCI address of the network card. I have also noticed that the same interrupt 169 was given to the graphic card.

The modules I have loaded for this card are 8139cp and 8139too.

Any update about this issue?
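
The shared-interrupt observation above can be checked from `/proc/interrupts`, where devices that share an IRQ line appear as a comma-separated list at the end of that line. A minimal sketch; the function name `shared_irqs` is hypothetical, and the optional file argument exists only so the filter can be tried on sample text.

```shell
#!/bin/sh
# Sketch: list IRQ lines that have more than one device attached.
# shared_irqs is a made-up helper name for this example.

shared_irqs() {
    file="${1:-/proc/interrupts}"

    # Keep numbered IRQ lines whose device list contains a comma,
    # which is how the kernel separates devices sharing one line.
    grep -E '^ *[0-9]+:.*,' "$file"
}
```

Running `shared_irqs` with no argument reads the live table. Sharing by itself is normal on PCI, so a hit here is a hint to investigate, not proof of the cause.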

kaz2100

03-29-2007 05:42 AM

Hi,

Unfortunately, my penguin has still same problem.....

Happy Penguins!

igorc

03-29-2007 10:55 PM

Hi,

Just to inform you that I have solved the problem. I was lucky that I still have dual boot on my laptop. Yes, unfortunately Windows helped me bring the network card back to life. I restarted in Windows and the switch port where the card was connected started flashing. When I got to the Windows login screen I restarted again, but into Linux this time. During the restart I could see the port light on the switch working all the time, and when I logged in the card was working!
So I guess Windows somehow reset the network card in a way Linux was not able to. It must be that the driver cannot establish initial communication on the network interface after some strange event, like a power outage or an irregular system power-off, blocks the card's TX capability.

Cheers,

nemestrinus

03-30-2007 01:43 PM

I had this same problem, very mysterious. I would get the NETDEV WATCHDOG timeout error, seemingly at random times, and my network interface would stop working until I rebooted the whole machine. I partially disabled ACPI in the kernel (not APIC; these are two different things, not a typo) by entering the following line in my /etc/lilo.conf

append = "pci=noacpi"

Since then I have not (yet?) had a recurrence of the problem.

kaz2100

03-31-2007 06:06 PM

Hya

Thanks, nemestrinus
For some reason, append="pci=noapci" in lilo.conf results in kernel panic with my penguin.

Thanks, igorc
Just restarting within penguin is good enough to clear the trouble. Then, after a while, it is dead again....

Happy Penguins!

All times are GMT -5.
