I'm running CentOS 7.1.1503 on a home-built box with an i3 processor. The box is a file server and has several internal mdadm RAID arrays: one for root, one for a small file share, and one for a large file share.
I've seen a number of spontaneous reboots:
Code:
127.0.0.1-2015.08.16-01:28:16/vmcore-dmesg.txt:[562354.563966] BUG: unable to handle kernel paging request at 000000000000212a
127.0.0.1-2015.08.27-07:12:31/vmcore-dmesg.txt:[969889.931877] BUG: unable to handle kernel paging request at ffff880112268b60
127.0.0.1-2015.08.28-16:52:55/vmcore-dmesg.txt:[ 4611.944684] kernel BUG at drivers/md/raid5.c:316!
127.0.0.1-2015.08.28-19:21:10/vmcore-dmesg.txt:[ 8833.527255] BUG: unable to handle kernel paging request at 000000020000039f
Some more detail on the last one:
Code:
[ 8833.527255] BUG: unable to handle kernel paging request at 000000020000039f
[ 8833.527293] IP: [<ffffffff81208d2a>] bio_integrity_advance+0x1a/0x60
[ 8833.527320] PGD 0
[ 8833.527327] Oops: 0000 [#1] SMP
[ 8833.527338] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_mirror dm_region_hash dm_log dm_mod nfsd intel_powerclamp coretemp eeepc_wmi asus_wmi sparse_keymap raid456 async_raid6_recov async_memcpy async_pq intel_rapl kvm_intel raid6_pq rfkill kvm iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek mxm_wmi snd_hda_codec_hdmi snd_hda_codec_generic crc32c_intel ghash_clmulni_intel snd_hda_intel snd_hda_controller snd_hda_codec aesni_intel snd_hwdep auth_rpcgss nfs_acl lockd mei_me async_xor snd_seq snd_seq_device xor async_tx lrw gf128mul shpchp wmi snd_pcm mei glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr serio_raw i2c_i801 snd_timer snd soundcore tpm_infineon sunrpc uinput ext4 mbcache jbd2 raid1 sd_mod crc_t10dif crct10dif_common
[ 8833.527606] i915 ahci libahci libata i2c_algo_bit drm_kms_helper e1000e drm ptp pps_core i2c_core video
[ 8833.527644] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.10.0-229.11.1.el7.x86_64 #1
[ 8833.527667] Hardware name: ASUS All Series/Z87-PLUS, BIOS 1405 08/19/2013
[ 8833.527686] task: ffff88030e8b6660 ti: ffff88030e8e4000 task.ti: ffff88030e8e4000
[ 8833.527704] RIP: 0010:[<ffffffff81208d2a>] [<ffffffff81208d2a>] bio_integrity_advance+0x1a/0x60
[ 8833.527736] RSP: 0018:ffff88031fb83cf0 EFLAGS: 00010202
[ 8833.527752] RAX: 00000001ffffffff RBX: 0000000000006000 RCX: 0000000000000003
[ 8833.527770] RDX: 0000000000000000 RSI: 0000000000006000 RDI: 00000001fb3f2b10
[ 8833.527790] RBP: ffff88031fb83d08 R08: 0000000000000001 R09: 00000000000002c0
[ 8833.527809] R10: ffff88030aa9a800 R11: 0000000000080000 R12: ffff88001c6c5c58
[ 8833.527828] R13: 00000000fffffffb R14: 0000000000006000 R15: ffff880131e9ac00
[ 8833.527846] FS: 0000000000000000(0000) GS:ffff88031fb80000(0000) knlGS:0000000000000000
[ 8833.527865] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8833.527880] CR2: 000000020000039f CR3: 000000000190a000 CR4: 00000000001407e0
[ 8833.527897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8833.527916] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 8833.527934] Stack:
[ 8833.527941] ffffffff811fe09d ffff88001c6c5c58 0000000000006000 ffff88031fb83d48
[ 8833.527968] ffffffff812ad447 0007a00000000000 ffff880131e9ac00 0000000000000000
[ 8833.527990] 0000000000000000 0000000000000000 ffff880131e9ac00 ffff88031fb83d70
[ 8833.528014] Call Trace:
[ 8833.528023] <IRQ>
[ 8833.528029]
[ 8833.528042] [<ffffffff811fe09d>] ? bio_advance+0x1d/0xd0
[ 8833.528063] [<ffffffff812ad447>] blk_update_request+0x77/0x350
[ 8833.528083] [<ffffffff812ad73c>] blk_update_bidi_request+0x1c/0x80
[ 8833.528101] [<ffffffff812ada1f>] blk_end_bidi_request+0x1f/0x60
[ 8833.528121] [<ffffffff812ada70>] blk_end_request+0x10/0x20
[ 8833.528142] [<ffffffff813f9cd8>] scsi_io_completion+0x108/0x650
[ 8833.528160] [<ffffffff813eece3>] scsi_finish_command+0xb3/0x110
[ 8833.528176] [<ffffffff813f9adf>] scsi_softirq_done+0x12f/0x160
[ 8833.528192] [<ffffffff812b3fb0>] blk_done_softirq+0x90/0xc0
[ 8833.528208] [<ffffffff81077b2f>] __do_softirq+0xef/0x280
[ 8833.528223] [<ffffffff81615b9c>] call_softirq+0x1c/0x30
[ 8833.528239] [<ffffffff81015d95>] do_softirq+0x65/0xa0
[ 8833.528252] [<ffffffff81077ec5>] irq_exit+0x115/0x120
[ 8833.528267] [<ffffffff81616738>] do_IRQ+0x58/0xf0
[ 8833.528284] [<ffffffff8160b9ed>] common_interrupt+0x6d/0x6d
[ 8833.528299] <EOI>
[ 8833.528307]
[ 8833.528318] [<ffffffff814aa022>] ? cpuidle_enter_state+0x52/0xc0
[ 8833.528333] [<ffffffff814aa018>] ? cpuidle_enter_state+0x48/0xc0
[ 8833.528352] [<ffffffff814aa155>] cpuidle_idle_call+0xc5/0x200
[ 8833.528370] [<ffffffff8101d14e>] arch_cpu_idle+0xe/0x30
[ 8833.528389] [<ffffffff810c6801>] cpu_startup_entry+0xf1/0x290
[ 8833.528410] [<ffffffff8104228a>] start_secondary+0x1ba/0x230
[ 8833.528426] Code: 08 66 89 57 28 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 8b 7f 60 48 8b 40 10 48 85 ff 48 8b 80 98 00 00 00 <48> 8b 90 a0 03 00 00 74 2a 48 85 d2 74 27 89 f0 55 c1 ee 09 c1
[ 8833.528536] RIP [<ffffffff81208d2a>] bio_integrity_advance+0x1a/0x60
[ 8833.528562] RSP <ffff88031fb83cf0>
[ 8833.528573] CR2: 000000020000039f
Most of that is Greek to me, but it's obviously somewhere in the I/O subsystem.
I haven't had a chance to run memtest86 yet, but that's my next move tomorrow. If it shows the RAM is OK, what should I try next?
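One part of the oops can at least be cross-checked mechanically: the faulting address in CR2 against the register dump. The sketch below is a minimal check in Python, assuming the bytes starting at the `<48>` marker in the Code: line decode as a 64-bit load from RAX plus a 32-bit displacement (that decoding is my reading of the bytes, not something the kernel printed). The helper `is_kernel_address` is a hypothetical name, and its "top 16 bits set" rule is a simplification of the x86_64 address layout.

```python
# Values copied from the oops output above.
rax = 0x00000001ffffffff
cr2 = 0x000000020000039f

# Instruction bytes starting at the <48> marker in the "Code:" line.
# Assumed decoding: 48 8b 90 <disp32> = load a 64-bit value from (RAX + disp32).
insn = bytes.fromhex("488b90a0030000")

# The last four bytes are a little-endian 32-bit displacement.
disp = int.from_bytes(insn[3:7], "little")


def is_kernel_address(addr):
    """Hypothetical helper: True if addr is in the upper (kernel) half.

    Simplification: treats any address whose top 16 bits are all set
    as a kernel address.
    """
    return addr >> 48 == 0xFFFF


print("displacement:", hex(disp))
print("rax + displacement:", hex(rax + disp))
print("matches CR2:", rax + disp == cr2)
print("rax looks like a kernel pointer:", is_kernel_address(rax))
```

If the sum matches CR2, the fault came from dereferencing whatever was in RAX, and a value like 00000001ffffffff is not a plausible kernel pointer (compare the ffff88... values in the other registers). That points at a corrupted pointer rather than a specific driver, but it doesn't by itself say whether RAM or software did the corrupting.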
Kernel is up to date:
Code:
3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
yum shows a few things I could update (mariadb, httpd, firefox, etc.), but they're all applications rather than system components. No kernel update is available.
Running CentOS is not a requirement if I would fare better on a different distro. Really, the box just needs smb, CrashPlan, and a few minor web things with php/mysql. It has 12GB of RAM, and while it's busy (load is often 3 or 4), it's a quad core i3 and shouldn't be oopsing like that regardless.