LinuxQuestions.org
12-23-2012, 12:16 PM   #1
spinner0205
LQ Newbie
 
Registered: Dec 2012
Posts: 1

Rep: Reputation: Disabled
Random Crashing


Over the last month or so my CentOS server has been crashing for reasons I cannot identify. It ran for over a year with regular yum updates and no problems. The load on the server is perfectly normal, with CPU usage at 5-6% and RAM usage at less than half of the 32GB installed (multiple smaller game servers run off this box). I am not sure this is a software issue at all.

I have pasted the part of my /var/log/messages file from around the time of my latest crash, right up to the crash itself. Because I am a CentOS newb, this is gibberish to me, so I am curious whether anything in the file points to a crash of some kind, or whether there are other logs I could check and paste. If not, I would suspect a hardware issue or overheating.

Here are the messages:

Code:
Dec 21 14:58:03 server1 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Dec 21 14:58:03 server1 kernel: Hardware name: X9SCL/X9SCM
Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Dec 21 14:58:03 server1 kernel: Modules linked in: fuse autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Dec 21 14:58:03 server1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.14.1.el6.x86_64 #1
Dec 21 14:58:03 server1 kernel: Call Trace:
Dec 21 14:58:03 server1 kernel: <IRQ>  [<ffffffff8106b7b7>] ? warn_slowpath_common+0x87/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8106b8a6>] ? warn_slowpath_fmt+0x46/0x50
Dec 21 14:58:03 server1 kernel: [<ffffffff81459c0d>] ? dev_watchdog+0x26d/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8108caad>] ? insert_work+0x6d/0xb0
Dec 21 14:58:03 server1 kernel: [<ffffffff814599a0>] ? dev_watchdog+0x0/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8107e937>] ? run_timer_softirq+0x197/0x340
Dec 21 14:58:03 server1 kernel: [<ffffffff810a23c0>] ? tick_sched_timer+0x0/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff81073f61>] ? __do_softirq+0xc1/0x1e0
Dec 21 14:58:03 server1 kernel: [<ffffffff81096d60>] ? hrtimer_interrupt+0x140/0x250
Dec 21 14:58:03 server1 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Dec 21 14:58:03 server1 kernel: [<ffffffff81073d45>] ? irq_exit+0x85/0x90
Dec 21 14:58:03 server1 kernel: [<ffffffff81506450>] ? smp_apic_timer_interrupt+0x70/0x9b
Dec 21 14:58:03 server1 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
Dec 21 14:58:03 server1 kernel: <EOI>  [<ffffffff812cddbe>] ? intel_idle+0xde/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff812cdda1>] ? intel_idle+0xc1/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff8109929d>] ? sched_clock_cpu+0xcd/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff81407c27>] ? cpuidle_idle_call+0xa7/0x140
Dec 21 14:58:03 server1 kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff814f754f>] ? start_secondary+0x22a/0x26d
Dec 21 14:58:03 server1 kernel: ---[ end trace c6b419e0a29214c3 ]---
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:04 server1 abrtd: Directory 'oops-2012-12-21-14:58:04-2219-0' creation detected
Dec 21 14:58:04 server1 abrt-dump-oops: Reported 1 kernel oopses to Abrt
Dec 21 14:58:04 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:06 server1 kernel: Bridge firewalling registered
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:14 server1 abrtd: Sending an email...
Dec 21 14:58:14 server1 abrtd: Email was sent to: root@localhost
Dec 21 14:58:14 server1 abrtd: New problem directory /var/spool/abrt/oops-2012-12-21-14:58:04-2219-0, processing
Dec 21 14:58:14 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:23 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:33 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:43 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:53 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 15:03:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Thanks in advance.
 
12-23-2012, 12:33 PM   #2
TobiSGD
Moderator
 
Registered: Dec 2009
Location: Germany
Distribution: Whatever fits the task best
Posts: 17,148
Blog Entries: 2

Rep: Reputation: 4886
This seems to be a hardware error; it looks like one of the network adapters (eth2, which uses the e1000e driver) is faulty.
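
To see how often the adapter misbehaves, the kernel messages can be tallied per interface. What follows is only a rough sketch, assuming Python is available on the box; the function name and the category labels are made up for illustration, and the patterns are copied from the log lines pasted in the first post.

```python
import re
from collections import Counter

# Patterns copied from the log lines in the first post. The category names
# on the left are invented labels, not anything the kernel reports.
PATTERNS = {
    "watchdog_timeout": re.compile(
        r"NETDEV WATCHDOG: (\w+) \(\w+\): transmit queue \d+ timed out"
    ),
    "adapter_reset": re.compile(r"(\w+): Reset adapter"),
    "phy_read_error": re.compile(r"(\w+): Error reading PHY register"),
}


def summarize_nic_events(lines):
    """Count matching kernel messages per (interface, category)."""
    counts = Counter()
    for line in lines:
        for category, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), category)] += 1
                break  # a line belongs to at most one category
    return counts


# A few lines taken from the pasted log, used here as sample input.
sample = [
    "Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register",
    "Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
]

for (iface, category), n in sorted(summarize_nic_events(sample).items()):
    print(iface, category, n)
```

To run it against the real log, pass an open file instead of the sample list, for example `summarize_nic_events(open("/var/log/messages", errors="replace"))`; reading that file usually requires root.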
 
12-23-2012, 05:18 PM   #3
secretservgy
Member
 
Registered: Jul 2006
Location: New York, USA
Distribution: kUbuntu 12.04.1 LTS, Debian, Whiite-Linux
Posts: 50

Rep: Reputation: 19
This seems to be a known CentOS bug that has been discussed on their bug boards.

The problem isn't the hardware itself, but the kmod-e1000e driver.

Here is a supposed fix, including instructions, that uses a newer version of the same driver:

Resolved Intel e1000e Driver Bug on 825751 Ethernet Controller
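
Before and after swapping the driver, it may help to confirm which e1000e version is actually loaded. A minimal sketch, assuming the module exports a version attribute under /sys/module (modules that declare a version normally do, but this is an assumption about this particular build); the function name is hypothetical.

```python
from pathlib import Path


def loaded_module_version(module="e1000e", sysfs_root="/sys/module"):
    """Return the version string exported by a loaded module, or None."""
    version_file = Path(sysfs_root) / module / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        # Module not loaded, or it does not export a version attribute.
        return None


print("e1000e driver version:", loaded_module_version() or "not available")
```

If the printed version is the same after installing the newer package, the old module is probably still the one in use, and a reboot or module reload would be the next thing to check.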
 
12-24-2012, 06:09 AM   #4
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
I have also had issues with the e1000e driver; it is quite buggy. Try updating it as secretservgy suggests. For me it caused random hangs on shutdown.
 
  

