Fedora This forum is for the discussion of the Fedora Project. |
07-02-2006, 04:32 PM
|
#1
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Rep:
|
FC5, 2.6.16 kernel, drops ethernet after 10+ minutes
Fedora Core 5 network problems
Problem:
Network interface fails after about 5-10 minutes. ifdown/ifup does not bring the
interface back up.
Hardware: E-machines T6528
Processor: Athlon 64 2.2 GHz
Motherboard: MSI MS-7207
Built in Ethernet: nVidia Corporation MCP51 Ethernet Controller, using forcedeth driver.
PCI Ethernet Card: Linksys NC100 Network Everywhere Fast Ethernet 10/100, using tulip driver
OS: Fedora Core 5 x86
History:
After the initial install, I did not have ethernet through the built in ethernet. I cannibalized the Linksys card from an old dead Linux box. This caused an IRQ conflict. I re-installed the OS (Switched away from the 64 bit version in the process, because I was having other issues), this time only installed the Linksys (the other interface still shows up in the gnome GUI network manager, but it's disabled). The Linksys card is eth0. It would stay up for two to three minutes at a time, then would fail.
Some googling yielded a link which I can't post yet, which suggested that this was a bug in the stock 2.6.15 kernel, affecting the tulip driver. I upgraded to 2.6.16-1.2122_FC5. At first, I thought that I had entirely fixed the problem; mean time between failures went from around 2 minutes to over 10 minutes.
Here's what happens during a constant ping, when the network goes down:
....
64 bytes from 192.168.1.1: icmp_seq=235 ttl=64 time=0.739 ms
ping: sendmsg: No buffer space available
....
Here's what happens when I stop and start the network while the interface is up:
[root@baz ~]# service network stop
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
[root@baz ~]# service network start
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Here are the specifics:
/etc/modprobe.conf:
alias eth1 forcedeth
alias scsi_hostadapter sata_nv
alias snd-card-0 snd-hda-intel
options snd-card-0 index=0
options snd-hda-intel index=0
remove snd-hda-intel { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-hda-intel
alias eth0 tulip
Here's the output from
grep eth0 /var/log/messages
Jul 2 04:05:20 baz kernel: NETDEV WATCHDOG: eth0: transmit timed out
... snip many lines of the same ...
Jul 2 08:34:45 baz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jul 2 08:47:22 baz avahi-daemon[1932]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.1.98.
Jul 2 08:47:25 baz kernel: NETDEV WATCHDOG: eth0: transmit timed out
{ reboot here }
Jul 2 14:34:05 baz avahi-daemon[1979]: New relevant interface eth0.IPv4 for mDNS.
Jul 2 14:34:05 baz avahi-daemon[1979]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.1.98.
Jul 2 14:34:05 baz avahi-daemon[1979]: Registering new address record for 192.168.1.98 on eth0.
Jul 2 14:34:07 baz kernel: eth0: ADMtek Comet rev 17 at d88a0c00, 00:04:5A:6E:66:37, IRQ 5.
Jul 2 14:34:08 baz kernel: eth0: Setting full-duplex based on MII#1 link partner capability of 45e1.
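To see whether the watchdog timeouts cluster at particular times, log lines like the ones above can be tallied per hour. This is only a sketch: `watchdog_timeouts_per_hour` is a made-up helper name, and it assumes the usual syslog layout seen here (month, day, then HH:MM:SS as the first three fields).

```shell
# Hypothetical helper: count "transmit timed out" watchdog messages per hour.
# $1 = interface name; syslog lines are read from stdin.
watchdog_timeouts_per_hour() {
    awk -v ifc="$1" '
        # keep only watchdog timeouts for the requested interface
        /NETDEV WATCHDOG/ && index($0, ifc ": transmit timed out") {
            split($3, t, ":")                  # $3 is HH:MM:SS
            count[$1 " " $2 " " t[1] ":00"]++  # bucket by month, day, hour
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Example (needs read access to the log):
#   grep eth0 /var/log/messages | watchdog_timeouts_per_hour eth0
```

Each output line is an hour bucket followed by the number of timeouts seen in it, which makes it easier to compare against the times the interface was brought up.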
Here's the configuration for eth0:
[tiger@baz modprobe.d]$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Linksys NC100 Network Everywhere Fast Ethernet 10/100
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:04:5a:6e:66:37
ONBOOT=yes
DHCP_HOSTNAME=baz.localdomain
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
IPADDR=192.168.1.98
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
Here's the output from ifconfig while the network is up. I don't think that it looks any different after the network goes down:
[tiger@baz ~]$ /sbin/ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:04:5A:6E:66:37
inet addr:192.168.1.98 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::204:5aff:fe6e:6637/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:985 errors:0 dropped:0 overruns:0 frame:0
TX packets:758 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:358771 (350.3 KiB) TX bytes:234303 (228.8 KiB)
Interrupt:5 Base address:0xc00
...
I'm a little concerned that eth0 has inet6 addr: fe80::204:5aff:fe6e:6637/64,
even though the configuration file shows IPV6INIT=no.
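For what it's worth, that fe80:: address is exactly what the standard EUI-64 derivation gives for this card's MAC, which suggests it was autoconfigured from the hardware address rather than read from any config file. A rough sketch of the derivation follows; the function name `mac_to_link_local` is invented for illustration.

```shell
# Hypothetical helper: derive the EUI-64 based IPv6 link-local address
# from a MAC address given as six colon-separated hex bytes.
# The body runs in a subshell so the IFS change does not leak out.
mac_to_link_local() (
    IFS=:
    set -- $1                      # split the MAC into $1..$6
    b1=$(( 0x$1 ^ 0x02 ))          # flip the universal/local bit of byte 1
    # insert ff:fe between the third and fourth bytes, then print as
    # four 16-bit groups after the fe80:: prefix
    printf 'fe80::%x:%x:%x:%x\n' \
        $(( (b1 << 8) | 0x$2 )) \
        $(( (0x$3 << 8) | 0xff )) \
        $(( 0xfe00 | 0x$4 )) \
        $(( (0x$5 << 8) | 0x$6 ))
)

# Example:
#   mac_to_link_local 00:04:5A:6E:66:37
```

For the HWaddr shown above this prints fe80::204:5aff:fe6e:6637, matching the inet6 line in the ifconfig output (minus the /64 prefix length).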
===
The fact that eth0 is failing is the main issue. There are a couple of other problems which may or may not be related:
1) dhclient writes 192.168.1.100 as the primary DNS server in /etc/resolv.conf. Because this server does not exist, all DNS lookups are slow. I actually wanted a static IP address on this box anyway, so I disabled DHCP and edited /etc/resolv.conf by hand. Nonetheless, this seems broken.
2) After eth0 fubars, when I reboot the box, the kernel hangs while trying to turn off iptables.
===
I'm inclined to think that this is a kernel or driver issue, but I don't really know which it is likely to be, or what to install next.
Oh... I forgot to mention... I rebooted into Windows XP, and ran a constant ping for 2 hours with no packet loss, therefore I don't think it's a network card issue.
--Barton
|
|
|
07-02-2006, 06:35 PM
|
#2
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
Ok... more info.
I had eth0 up for most of the afternoon. I was SSH'd into the FC5 box and I decided to do another constant ping. Here's what I grabbed from PuTTY after the network went down:
64 bytes from 192.168.1.1: icmp_seq=3424 ttl=64 time=0.715 ms
64 bytes from 192.168.1.1: icmp_seq=3425 ttl=64 time=0.724 ms
64 bytes from 192.168.1.1: icmp_seq=3426 ttl=64 time=0.720 ms
64 bytes from 192.168.1.1: icmp_seq=3427 ttl=64 time=0.724 ms
64 bytes from 192.168.1.1: icmp_seq=3428 ttl=64 time=0.724 ms
64 bytes from 192.168.1.1: icmp_seq=3429 ttl=64 time=0.709 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
This means that the network was still up and passing traffic (my SSH session) after ping started failing. I did have a feeling that ping might have triggered the problem...
|
|
|
07-04-2006, 12:28 PM
|
#3
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
I rebooted last night, got the network running again, and left the network up and running without a constant ping. eth0 stayed up for about 2 hours, then failed again. Once again, /var/log/messages shows
baz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Due to the error message I was getting with ping earlier
(ping: sendmsg: No buffer space available), I'm guessing that something is keeping sendmsg's buffer from clearing, and it seems that this is *not* simply
the network failing, because I'm seeing those messages across the network.
I'm assuming that sendmsg() is being called by ping, and several other programs, which is why the network continues to go down even though I'm not pinging anything. I'm also assuming that sshd is *not* using sendmsg(), which is why I continue to see messages across the network even after the sendmsg buffer is full.
Now... what that actually means for my network, or how to fix it, I have no idea.
|
|
|
07-04-2006, 03:50 PM
|
#4
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
dmesg | grep Tulip
Linux Tulip driver version 1.1.13 (May 11, 2002)
|
|
|
07-04-2006, 10:56 PM
|
#5
|
LQ Newbie
Registered: Mar 2005
Location: Ohio
Distribution: FC6->F7
Posts: 23
Rep:
|
The tulip is a well-supported chip if that's your eth0 - that's not the issue. The IPv6 address is OK, no worries. You aren't giving us much to work with.
Are you running NetworkManager and NetworkManagerDispatcher? These are relatively new services and seem to be a little 'twitchy'. They can take down your connection. Disable them (System->Administration->ServerSettings->Services): unclick the services, save & reboot.
You also have an eth1 alias. Is it up? Does it stay up?
There is some sort of unresolved glitch such that after a suspend/resume the ethernet handles (eth1, eth0) may be scrambled. Check that the HWaddr for eth0 remains constant before/after the failure (use ifconfig -a).
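One way to do the HWaddr comparison without eyeballing it is sketched here. The name `hwaddr_of` is made up, and the parsing assumes the classic net-tools `ifconfig` layout shown earlier in this thread, where the interface name starts the line and the MAC follows a `HWaddr` token.

```shell
# Hypothetical helper: print the hardware address of one interface.
# $1 = interface name; `ifconfig -a` style output is read from stdin.
hwaddr_of() {
    awk -v ifc="$1" '
        # the header line of each interface starts with its name
        $1 == ifc {
            for (i = 1; i < NF; i++)
                if ($i == "HWaddr") print $(i + 1)
        }
    '
}

# Example: record the address while the link is up, then compare later.
#   before=$(/sbin/ifconfig -a | hwaddr_of eth0)
#   ... wait for the failure ...
#   after=$(/sbin/ifconfig -a | hwaddr_of eth0)
#   [ "$before" = "$after" ] || echo "HWaddr changed: $before -> $after"
```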
The only other thought is that you should boot to single user mode and test the problem. Use the grub interface to add the word "single" to the kernel command line. (I think you type any character at the grub splash-screen, then type an 'e' for edit, then add the characters " single" and hit return). You'll get a tty console as root (no X11). Then try the periodic ping with a command like
# while true; do ping -c 1 192.168.1.1 ; sleep 30; done
You may have to manually bring up the interface with 'ifup eth0'. DON'T start the network services - too many possible implications.
|
|
|
07-05-2006, 06:54 AM
|
#6
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
Quote:
Originally Posted by steve-alexander
The tulip is a well-supported chip if that's your eth0 - that's not the issue. The IPv6 address is OK, no worries. You aren't giving us much to work with.
Are you running NetworkManager and NetworkManagerDispatcher? These are relatively new services and seem to be a little 'twitchy'. They can take down your connection. Disable them (System->Administration->ServerSettings->Services): unclick the services, save & reboot.
You also have an eth1 alias. Is it up? Does it stay up?
|
I looked at the services; NetworkManager and NetworkManagerDispatcher are both disabled.
eth1 has been disabled in the BIOS, and is not enabled on boot.
Quote:
Originally Posted by steve-alexander
There is some sort of unresolved glitch such that after a suspend/resume the ethernet handles (eth1, eth0) may be scrambled. Check that the HWaddr for eth0 remains constant before/after the failure (use ifconfig -a).
|
I saw this change *once* a few days ago. I've kept an eye on it ever since, and in all the times that the network has failed since then, the HWaddr has stayed constant.
I tried 'init 1', I tried to enter single user mode through grub... I finally had to edit /etc/inittab to get into single user mode. I'm currently running a constant ping to my router. I'll let that run for a couple of hours; this should be enough to re-create the problem if it exists in single user mode.
|
|
|
07-05-2006, 10:21 AM
|
#7
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
Under single user mode, I brought up eth0 using
ifup eth0
I ran a constant ping for 7200 seconds, with no packet loss.
I did an 'ifdown eth0', then as a control, brought the network up using
service network start
I've run a constant ping for over an hour, and the network is currently still up.
I'm considering booting back into multi-user mode and running something like
time (while ping -c 1 192.168.1.1; do sleep 20; done)
to give me some sense of mean-time between failure, because I don't know just how much confidence I have that the trouble will occur in 2 hours... my gut feeling is that two hours should be enough, but I would hate to rule something out as a problem, only to find out that if I had run the ping for 15 minutes longer, it would have failed...
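A slightly more reusable take on that timing loop is sketched below. `time_until_failure` is an invented name; it runs whatever probe command it is given (for example the single ping above) and prints how many whole seconds passed before the probe first failed.

```shell
# Hypothetical helper: repeat a probe command until it fails, then print
# the elapsed time in seconds.
# $1 = seconds to sleep between probes; remaining args = probe command.
time_until_failure() {
    interval=$1
    shift
    start=$(date +%s)
    # keep probing for as long as the probe command exits successfully
    while "$@" >/dev/null 2>&1; do
        sleep "$interval"
    done
    end=$(date +%s)
    echo "$(( end - start ))"
}

# Example: probe the router every 20 seconds, log the time to failure.
#   echo "$(date): failed after $(time_until_failure 20 ping -c 1 192.168.1.1)s" >> ~/mtbf.log
```

Appending each run to a file gives a small sample of failure times to average, rather than relying on a single two-hour run.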
My next step is to boot back into multi-user mode, test the time between failures, and do some research on how runlevel 5 brings up the network, and how that differs from 'service network start' when run in single user mode.
|
|
|
07-05-2006, 12:46 PM
|
#8
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
The first time I ran the timing command, it ran for 7 minutes and 50 seconds.
The second time, I ran a flood ping from another Linux box to this one... the network went down. I rebooted and did this again, and the same thing happened. I had done a ping flood to the box earlier while I had it in single user mode;
there was 0% packet loss at that time, and eth0 stayed up.
|
|
|
07-08-2006, 02:33 PM
|
#9
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
Well... I've decided to throw in the towel, wipe the hard drive and install FC4.
|
|
|
07-09-2006, 01:30 AM
|
#10
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
FC4 is up and running, eth0 is running smoothly.
|
|
|
07-19-2006, 08:33 PM
|
#11
|
Senior Member
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,849
Rep: 
|
Hi,
I also had a similar problem (dead network after a while) with my Penguin.
Debian etch
kernel 2.6.17.4
Toshiba Satellite A100 ST2311 -> please refer to the HCL entry I submitted.
I just turned off the watchdog in the kernel config; so far my penguin is healthy.
|
|
|
07-20-2006, 12:00 AM
|
#12
|
Member
Registered: Jan 2006
Distribution: Fedora 4, 5
Posts: 30
Rep:
|
Just wondering, do you have SELinux enabled? Or any other kind of firewall? FC5 is still a
bit ragged around the edges. A hang or glitch there could conceivably cause problems like yours.
|
|
|
07-20-2006, 04:27 PM
|
#13
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Original Poster
Rep:
|
Well I'll be... I figured the watchdog was just reporting the error... never occurred to me that it might be *causing* the error.
Which kind of begs the question... what does watchdog do, anyway?
Moonlit: SELinux was enabled, but it was in logging mode only.
|
|
|
07-21-2006, 07:56 PM
|
#14
|
Member
Registered: Jan 2006
Distribution: Fedora 4, 5
Posts: 30
Rep:
|
If SELinux was only in logging mode, then I doubt it's causing the problem. I have a similar problem though. Or had, rather. I have 2 comps, one with Win2k installed, and one with FC5. I have them connected to a router, which is in turn connected to my cable modem. I had a problem with my internet connection dropping out on me on one or both systems. I finally reprogrammed the router to auto-discover its connection settings. So far, no more problem.
|
|
|
07-24-2006, 03:23 PM
|
#15
|
Senior Member
Registered: Apr 2005
Location: Penguin land, with apple, no gates
Distribution: SlackWare > Debian testing woody(32) sarge etch lenny squeeze(+64) wheezy .. bullseye bookworm
Posts: 1,849
Rep: 
|
I have to say, "watchdog is only reporting" seems to be correct. But my penguin has been totally healthy ever since the watchdog was turned off. When it was on, I had a dead network every once in a short while.
My guess is something like an IRQ conflict, or something ACPI- or APM-related?
|
|
|
All times are GMT -5. The time now is 02:26 PM.