Over the last month or so, my CentOS server has been crashing for reasons I cannot identify. It had been running for over a year with regular yum updates and no problems. The load on the server is perfectly normal, with CPU usage at 5-6% and RAM usage below half of the 32GB installed (multiple smaller game servers run on this box). I am not sure this is a software issue at all.
I have pasted the part of my /var/log/messages file leading up to my latest crash. Because I am a CentOS newb, this is gibberish to me, so I am curious whether anything in the file points to a crash of some kind, or whether there are other logs I could check and paste. If not, I would suspect a hardware issue or overheating.
Here are the messages:
Code:
Dec 21 14:58:03 server1 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Dec 21 14:58:03 server1 kernel: Hardware name: X9SCL/X9SCM
Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Dec 21 14:58:03 server1 kernel: Modules linked in: fuse autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Dec 21 14:58:03 server1 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.14.1.el6.x86_64 #1
Dec 21 14:58:03 server1 kernel: Call Trace:
Dec 21 14:58:03 server1 kernel: <IRQ> [<ffffffff8106b7b7>] ? warn_slowpath_common+0x87/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8106b8a6>] ? warn_slowpath_fmt+0x46/0x50
Dec 21 14:58:03 server1 kernel: [<ffffffff81459c0d>] ? dev_watchdog+0x26d/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8108caad>] ? insert_work+0x6d/0xb0
Dec 21 14:58:03 server1 kernel: [<ffffffff814599a0>] ? dev_watchdog+0x0/0x280
Dec 21 14:58:03 server1 kernel: [<ffffffff8107e937>] ? run_timer_softirq+0x197/0x340
Dec 21 14:58:03 server1 kernel: [<ffffffff810a23c0>] ? tick_sched_timer+0x0/0xc0
Dec 21 14:58:03 server1 kernel: [<ffffffff8102b40d>] ? lapic_next_event+0x1d/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff81073f61>] ? __do_softirq+0xc1/0x1e0
Dec 21 14:58:03 server1 kernel: [<ffffffff81096d60>] ? hrtimer_interrupt+0x140/0x250
Dec 21 14:58:03 server1 kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Dec 21 14:58:03 server1 kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Dec 21 14:58:03 server1 kernel: [<ffffffff81073d45>] ? irq_exit+0x85/0x90
Dec 21 14:58:03 server1 kernel: [<ffffffff81506450>] ? smp_apic_timer_interrupt+0x70/0x9b
Dec 21 14:58:03 server1 kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
Dec 21 14:58:03 server1 kernel: <EOI> [<ffffffff812cddbe>] ? intel_idle+0xde/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff812cdda1>] ? intel_idle+0xc1/0x170
Dec 21 14:58:03 server1 kernel: [<ffffffff8109929d>] ? sched_clock_cpu+0xcd/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff81407c27>] ? cpuidle_idle_call+0xa7/0x140
Dec 21 14:58:03 server1 kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
Dec 21 14:58:03 server1 kernel: [<ffffffff814f754f>] ? start_secondary+0x22a/0x26d
Dec 21 14:58:03 server1 kernel: ---[ end trace c6b419e0a29214c3 ]---
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:04 server1 abrtd: Directory 'oops-2012-12-21-14:58:04-2219-0' creation detected
Dec 21 14:58:04 server1 abrt-dump-oops: Reported 1 kernel oopses to Abrt
Dec 21 14:58:04 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:06 server1 kernel: Bridge firewalling registered
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:14 server1 abrtd: Sending an email...
Dec 21 14:58:14 server1 abrtd: Email was sent to: root@localhost
Dec 21 14:58:14 server1 abrtd: New problem directory /var/spool/abrt/oops-2012-12-21-14:58:04-2219-0, processing
Dec 21 14:58:14 server1 abrtd: Can't open file '/var/spool/abrt/oops-2012-12-21-14:58:04-2219-0/uid': No such file or directory
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:23 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:23 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:33 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:33 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:43 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:43 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:58:53 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:58:53 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 14:59:13 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 14:59:13 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter
Dec 21 15:03:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register
Dec 21 15:03:03 server1 kernel: e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
I have also had issues with the e1000e driver; it is quite buggy. Try updating it as secretservgy suggests. For me it caused random hangs on shutdown.
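If you want to see how often the adapter is resetting before and after a driver update, a rough Python sketch like the one below could tally the relevant lines from /var/log/messages. This is only an illustration: the names `summarize`, `RESET_RE`, `WATCHDOG_RE` and `SAMPLE` are made up for this example, and the patterns assume the log lines look exactly like the ones pasted above.

```python
import re
from collections import Counter

# Assumed format, taken from the pasted log:
# "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter"
RESET_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"e1000e\s+\S+\s+(?P<iface>\w+):\s+Reset adapter"
)

# "... kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\w+) \((?P<driver>\w+)\): "
    r"transmit queue \d+ timed out"
)


def summarize(lines):
    """Count adapter resets and watchdog timeouts per interface."""
    resets = Counter()      # interface -> number of "Reset adapter" lines
    timeouts = Counter()    # (interface, driver) -> number of watchdog timeouts
    first_seen = {}         # interface -> timestamp of the first reset
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            iface = m.group("iface")
            resets[iface] += 1
            first_seen.setdefault(iface, m.group("stamp"))
            continue
        m = WATCHDOG_RE.search(line)
        if m:
            timeouts[(m.group("iface"), m.group("driver"))] += 1
    return resets, timeouts, first_seen


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Dec 21 14:58:03 server1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
    "Dec 21 14:58:03 server1 kernel: e1000e 0000:02:00.0: eth2: Error reading PHY register",
    "Dec 21 14:58:13 server1 kernel: e1000e 0000:02:00.0: eth2: Reset adapter",
]

resets, timeouts, first_seen = summarize(SAMPLE)
for iface, count in resets.items():
    print(f"{iface}: {count} resets, first at {first_seen[iface]}")
for (iface, driver), count in timeouts.items():
    print(f"{iface} ({driver}): {count} transmit timeouts")
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `summarize(open("/var/log/messages"))` as root. If the counts keep climbing after the driver update, that would point more toward hardware than the driver.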