LinuxQuestions.org
Linux - Networking This forum is for any issue related to networks or networking.
Routing, network cards, OSI, etc. Anything is fair game.

Old 10-13-2006, 02:13 PM   #1
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Rep: Reputation: 108Reputation: 108
NETDEV WATCHDOG: eth0: transmit timed out


Hi, Penguins.

This is one of those old problems.

I am hitting this error (Ethernet dies under some conditions).

My system:
Toshiba Satellite A100 ST2311.
eth0 is RTL8139 B/D.
I checked with kernels 2.6.15, 2.6.17, 2.6.18.

The error I got (through dmesg) is
Code:
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 0c 0005 c07f media 00.
eth0: Tx queue start entry 4  dirty entry 0.
eth0:  Tx descriptor 0 is 0008a03c. (queue head)
eth0:  Tx descriptor 1 is 0008a03c.
eth0:  Tx descriptor 2 is 0008a03c.
eth0:  Tx descriptor 3 is 0008a03c.
eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
NETDEV WATCHDOG: eth0: transmit timed out
I checked several sites and they say:
1. Disable ACPI -> My BIOS does not have that option.
2. Check IRQ -> ifconfig says "Interrupt:185 Base address:0x2000" (185!?)

Does anybody know how to solve this?

Happy Penguins!
 
Old 10-13-2006, 11:42 PM   #2
Blindsight
Member
 
Registered: Mar 2003
Distribution: Slackware
Posts: 234

Rep: Reputation: 30
I had a similar problem in the past. I think my problem was related to apm or apci. Does `ifconfig eth0 down; ifconfig eth0 up` bring your device back to life? Or, alternatively, `rmmod rtl8139; insmod rtl8139 || modprobe rtl8139` (assuming you're loading your network device as a module)?

Did you recompile your kernel leaving power management support out, or are you assuming power management was left out? "Assuming" is the king of all Mother Fathers. [use your imagination]
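
The two recovery attempts suggested above could be wrapped up roughly like this. It is only a sketch under assumptions: eth0 comes from the first post, the module name 8139too is taken from the lshw output further down this thread (driver=8139too), and the IFACE, MODULE and RUN variables and both function names are made up for illustration. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: bounce the interface, then reload its driver module.
# IFACE, MODULE and RUN are illustrative names, not from the original posts.
IFACE="${IFACE:-eth0}"        # interface name from the first post
MODULE="${MODULE:-8139too}"   # assumption: driver name as reported by lshw in this thread
RUN="${RUN-echo}"             # default: print commands; use RUN= (empty) as root to execute them

reset_iface() {
    # Take the interface down and bring it back up.
    $RUN ifconfig "$IFACE" down
    $RUN ifconfig "$IFACE" up
}

reload_driver() {
    # Unload and reload the driver (only possible if it is built as a module).
    $RUN modprobe -r "$MODULE"
    $RUN modprobe "$MODULE"
}
```

Calling reset_iface first and reload_driver only if the link stays dead keeps the less disruptive step in front.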
 
Old 10-16-2006, 04:00 PM   #3
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Thanks,

I agree with you. This problem seems to be related to ACPI (or apci .. typo ??).

Here is what I did:
Unselected ACPI and compiled the kernel -> results in
Code:
VFS: Cannot open root device "801" or unknown-block(8,1)
and kernel panic.

It seems that several other penguins have had trouble with this message and ACPI.

I will work on this and post update.

Happy penguins!
 
Old 10-17-2006, 03:50 PM   #4
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hi,

1. I found several posts which say that APIC is related. -> With kernel option "noapic", the situation got worse. Almost instantly, eth0 dies!!

2. I also found several posts which say that ACPI is related. -> Kernel option "acpi=no" results in a kernel panic at startup!!

3. So far, eth0 dies faster with kernels 2.6.15 and 2.6.18 than with 2.6.17. It survives about an hour with 2.6.15 and 2.6.18, but more than several hours with 2.6.17.

4. Happy Penguins!
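
For comparing kernels like in point 3, counting the watchdog messages in the kernel log may be easier than watching by hand. A minimal sketch, assuming the message text is exactly as in the dmesg output of the first post; the function name count_tx_timeouts is made up for illustration.

```shell
#!/bin/sh
# Sketch: count watchdog transmit timeouts in kernel log text read from stdin.
# count_tx_timeouts is an illustrative name, not an existing tool.
count_tx_timeouts() {
    # grep -c prints the number of matching lines (0 if there are none);
    # "|| true" hides grep's non-zero exit status when nothing matches.
    grep -c 'NETDEV WATCHDOG: .*transmit timed out' || true
}

# Example use: dmesg | count_tx_timeouts
```

Running it from cron together with `date` would give a rough record of when the interface died under each kernel.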
 
Old 12-04-2006, 07:43 PM   #5
kgirrard
LQ Newbie
 
Registered: Dec 2006
Posts: 3

Rep: Reputation: 0
I fought this for over a month on a system with three different ethernet controllers, a VIA VT310-DP with e100, Rhine-II, and VIA Velocity interfaces. Thought it was the controller/driver but eventually every interface exhibited the same type of error as I updated others. The error sometimes didn't occur for 24 hours, sometimes in 30 minutes, and under varying traffic loads, day or night.

My fix? I booted the kernel with the "pci=noacpi" option. I needed the functionality of ACPI for things like the power button so completely disabling ACPI wasn't possible, but haven't noticed any significant difference in how the machine operates with the less-intrusive PCI option. I'm using the e100 and velocity interfaces, with the Rhine now disconnected.

I'm currently on kernel 2.6.18 with e100 driver version 3.5.17-k2-NAPI, via-velocity version 1.14, and via-rhine version 1.4.1 but had this problem with 2.6.17 as well (I started with that version on this platform so don't have any experience with earlier versions).

Hope this helps someone. This drove me crazy as it was impossible to reproduce on a non-production system. Flood-pinged one for hours and it never flinched, while the one installed as a NAT gateway (with 300 hosts behind it) rarely ran for more than 12 hours.
 
Old 12-09-2006, 11:21 AM   #6
kgirrard
LQ Newbie
 
Registered: Dec 2006
Posts: 3

Rep: Reputation: 0
There appeared to be a significant change in interrupt handling in 2.6.19 but the problem persists, unchanged.

I also discovered that although ACPI isn't completely disabled with the pci=noacpi option, it is broken. The type of interrupt (edge or level) gets set incorrectly so ACPI events are never detected. It's supposed to be 'level' but is set to 'edge' instead. So much for having the power button work.
 
Old 01-31-2007, 11:29 AM   #7
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hi,

I tried many things, including sending the hardware back to the manufacturer's facility. I have not seen network trouble in wIdNowS. I tried kernels 2.6.15-18 with several different options, and the results were more or less the same. (2.6.19 did not kick in, although I have not spent time on it.)
Also, there is another thread.
http://www.linuxquestions.org/questi...d.php?t=515819
I somewhat think this may be intrinsic to the hardware....

lspci says
Code:
 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
lshw says
Code:
 *-network:1
                description: Ethernet interface
                product: RTL-8139/8139C/8139C+
                vendor: Realtek Semiconductor Co., Ltd.
                physical id: 7
                bus info: pci@02:07.0
                logical name: eth0
                version: 10
                serial: 00:a0:d1:2e:b1:ea
                size: 10MB/s
                capacity: 100MB/s
                width: 32 bits
                clock: 33MHz
                capabilities: bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=8139too driverversion=0.9.27 duplex=half ip=129.22.227.82 latency=64 link=no maxlatency=64 mingnt=32 multicast=yes port=MII speed=10MB/s
                resources: ioport:a000-a0ff iomemory:c0211000-c02110ff irq:193
In any case, Happy Penguins!
 
Old 02-09-2007, 01:38 PM   #8
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hi,

Further experiments on my penguin have still brought no good news. (acpi verbose mode does not give me much info...)

Further web searching is kind of disappointing.

http://bugzilla.kernel.org/show_bug.cgi?id=6138
http://lkml.org/lkml/2004/5/24/4
http://lkml.org/lkml/2004/11/1/9

I am not very good at following the above threads. Status is "NEW" in bugzilla, so not much hope??

Anyway, Happy Penguins!
 
Old 02-12-2007, 06:33 AM   #9
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Other threads suggest forcing the ethernet device to do no auto-negotiation with mii-tool or ethtool (whatever it accepts). If this works but it is not a kernel module driver option you can add to modprobe.conf or equiv, you'll have to set it each time using your distro's networking script(s). In addition http://www.linuxquestions.org/questi...d.php?t=347599 mentions disabling the mDNSResponder service.
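
As a rough sketch of what forcing the link could look like with either tool: 100 Mbps full duplex is an assumption taken from the "link up, 100Mbps, full-duplex" line in the first post, and IFACE, RUN and the function names are made up for illustration. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: force speed/duplex with auto-negotiation off, using ethtool or mii-tool.
# IFACE, RUN and the function names are illustrative, not from the original posts.
IFACE="${IFACE:-eth0}"
RUN="${RUN-echo}"   # default: print commands; use RUN= (empty) as root to execute them

force_link_ethtool() {
    # ethtool: set speed and duplex explicitly and switch auto-negotiation off.
    $RUN ethtool -s "$IFACE" speed 100 duplex full autoneg off
}

force_link_miitool() {
    # mii-tool: -F forces a media type and disables auto-negotiation.
    $RUN mii-tool -F 100baseTx-FD "$IFACE"
}
```

Only one of the two is needed, whichever tool the driver accepts. The setting does not survive a reboot or a driver reload.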
 
Old 02-12-2007, 11:32 AM   #10
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hi,

Thank you for your post.

However, disabling autoneg does not help (or the situation gets worse), and I do not have mDNSResponder running.

One thing I notice is that dmesg identifies the chip as follows (which does not agree with lspci or lshw ??)
Code:
eth0:  Identified 8139 chip type 'RTL-8100B/8139D'
Happy Penguins!
 
Old 03-27-2007, 02:25 AM   #11
igorc
Member
 
Registered: May 2005
Location: Sydney, Australia
Distribution: Ubuntu 5.04, Debian 3.1
Posts: 74

Rep: Reputation: 15
Hi kaz2100,

I have exactly the same problem with my Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) network card. It was working fine in Ubuntu 6.06 with the 2.6.15-23 kernel until recently. It suddenly decided to stop working, throwing the same NETDEV WATCHDOG error on boot up as in your case.

It might be ACPI related error but I'm not sure. I can see some ACPI messages related to the network card like:
ACPI: PCI interrupt for device 0000:03:07.0 disabled
ACPI: PCI interrupt 0000:03:07.0[A] -> GSI 16 (level, low) -> IRQ 169

where 0000:03:07.0 is the PCI address of the network card. I have also noticed that the same interrupt 169 was given to the graphic card.

The modules I have loaded for this card are 8139cp and 8139too.

Any update about this issue?
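
To see what else sits on the same interrupt as the network card, the matching line of /proc/interrupts can be pulled out. A minimal sketch; the function name irq_users and the optional file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the /proc/interrupts line for one IRQ number.
# irq_users is an illustrative name; the optional second argument lets you
# point it at a saved copy of /proc/interrupts instead of the live file.
irq_users() {
    irq="$1"
    file="${2:-/proc/interrupts}"
    # The first field of each data line is the IRQ number followed by a colon.
    awk -v irq="$irq:" '$1 == irq' "$file"
}

# Example use: irq_users 169
```

Devices sharing the interrupt are listed together at the end of the printed line.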

Last edited by igorc; 03-27-2007 at 02:36 AM.
 
Old 03-29-2007, 05:42 AM   #12
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hi,

Unfortunately, my penguin has still same problem.....

Happy Penguins!
 
Old 03-29-2007, 10:55 PM   #13
igorc
Member
 
Registered: May 2005
Location: Sydney, Australia
Distribution: Ubuntu 5.04, Debian 3.1
Posts: 74

Rep: Reputation: 15
Hi,

Just to inform you that I have solved the problem. I was lucky that I still have dual boot on my laptop. Yes, unfortunately Windows helped me bring the network card back to life. I restarted into Windows and the switch port where the card was connected started flashing. When I got the Windows login screen I restarted again, but into Linux this time. During the restart I could see the port light on the switch working all the time, and when I logged in the card was working!
So I guess Windows somehow restarted the network card in a way Linux was not able to. It must be something with the driver not being able to establish initial communication on the network interface after some strange event, like a power outage or irregular system power-off, blocks the card's TX capability.

Cheers,
 
Old 03-30-2007, 01:43 PM   #14
nemestrinus
LQ Newbie
 
Registered: Dec 2006
Location: california
Distribution: slackware
Posts: 21

Rep: Reputation: 0
I had this same problem, very mysterious. I would get the NETDEV WATCHDOG timeout error, seemingly at random times, and my network interface would stop working until I rebooted the whole machine. I partially disabled ACPI in the kernel (not APIC; these are two different things, not a typo) by entering the following line in my /etc/lilo.conf:

append = "pci=noacpi"

Since then I have not (yet?) had a recurrence of the problem.
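
After rerunning lilo and rebooting, it may be worth checking that the option really reached the kernel. A minimal sketch, with has_boot_opt being a made-up helper name; by default it reads /proc/cmdline.

```shell
#!/bin/sh
# Sketch: check whether a kernel boot option is present on the command line.
# has_boot_opt is an illustrative name; the optional second argument replaces
# the contents of /proc/cmdline (handy for trying it out).
has_boot_opt() {
    opt="$1"
    cmdline="${2-$(cat /proc/cmdline)}"
    # Compare whole space-separated words so "noacpi" does not match "pci=noacpi".
    for word in $cmdline; do
        [ "$word" = "$opt" ] && return 0
    done
    return 1
}

# Example use: has_boot_opt pci=noacpi && echo "pci=noacpi was passed to the kernel"
```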

Last edited by nemestrinus; 03-30-2007 at 01:50 PM.
 
Old 03-31-2007, 06:06 PM   #15
kaz2100
Senior Member
 
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,832

Original Poster
Rep: Reputation: 108Reputation: 108
Hya

Thanks, nemestrinus
For some reason, append="pci=noacpi" in lilo.conf results in a kernel panic with my penguin.

Thanks, igorc
Just restarting within penguin is good enough to clear the trouble. Then, after a while, it is dead again....


Happy Penguins!
 
  

