kernel oops troubleshooting
I'm not even sure where to start; I've never had to debug one of these before. According to my boss, the oopses build up and take CPUs out one by one until the server dies. It started when the server was upgraded from Debian 5 to Debian 6. I'm getting an oops about every 8 hours. Any ideas?
[51120.420944] BUG: unable to handle kernel paging request at ffff88036840c1c0
[51120.420982] IP: [<ffffffff8123c6c9>] dma_memcpy_to_iovec+0xdd/0x145
[51120.421019] PGD 1002063 PUD 0
[51120.421047] Oops: 0000 [#2] SMP
[51120.421074] last sysfs file: /sys/devices/virtual/block/md1/md/mismatch_cnt
[51120.421105] CPU 1
[51120.421128] Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss sunrpc bonding ext3 jbd mbcache ohci_hcd snd_hda_intel snd_pcsp snd_hda_codec snd_hwdep snd_pcm snd_timer snd i2c_i801 psmouse ioatdma soundcore serio_raw button i2c_core evdev processor dca snd_page_alloc xfs exportfs raid1 md_mod dm_mod sd_mod crc_t10dif usbhid hid ata_generic ata_piix uhci_hcd libata ehci_hcd scsi_mod usbcore e1000e nls_base thermal thermal_sys [last unloaded: scsi_wait_scan]
[51120.421413] Pid: 23164, comm: python Tainted: G D 2.6.32-bpo.5-amd64 #1 X8DTL
[51120.421458] RIP: 0010:[<ffffffff8123c6c9>] [<ffffffff8123c6c9>] dma_memcpy_to_iovec+0xdd/0x145
[51120.421511] RSP: 0018:ffff8803295b3b48 EFLAGS: 00010202
[51120.421538] RAX: ffff88033d95a020 RBX: 0000000000000440 RCX: ffff880197e55064
[51120.421568] RDX: 0000000005556434 RSI: ffff8803295b3ed8 RDI: ffff8801bd51f818
[51120.421598] RBP: ffff8803295b3ed8 R08: 0000000000000440 R09: ffff88033d95a000
[51120.421628] R10: 00000002d6ae3f18 R11: ffff8801a8729a00 R12: 0000000000000000
[51120.421659] R13: 0000000000000bc0 R14: 000ffff805556434 R15: 00000000000005a8
[51120.421689] FS: 00007faab210f720(0000) GS:ffff880006e20000(0000) knlGS:0000000000000000
[51120.421735] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[51120.421762] CR2: ffff88036840c1c0 CR3: 000000033df62000 CR4: 00000000000006e0
[51120.421792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[51120.421822] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[51120.421853] Process sr-python (pid: 23164, threadinfo ffff8803295b2000, task ffff88033c31dbd0)
[51120.421899] Stack:
[51120.421919] ffff88033d95a010 ffff880197e55064 ffff88033d95a000 ffff8803295b3ed8
[51120.421956] <0> ffff8801bd51f818 00000000000005a8 00000000000005a8 00000000000005a8
[51120.422013] <0> 00000000000005a8 0000000000000000 0000000000000000 ffffffff8125c533
[51120.422084] Call Trace:
[51120.422111] [<ffffffff8125c533>] ? dma_skb_copy_datagram_iovec+0x6f/0x234
[51120.422144] [<ffffffff81064ac2>] ? autoremove_wake_function+0x0/0x2e
[51120.422176] [<ffffffff8127d84d>] ? tcp_recvmsg+0x696/0xa9e
[51120.422207] [<ffffffff81241c76>] ? sock_common_recvmsg+0x30/0x45
[51120.422237] [<ffffffff8123fc98>] ? sock_aio_read+0xb9/0xc4
[51120.422267] [<ffffffff810365bb>] ? flush_tlb_page+0x5a/0x7b
[51120.422298] [<ffffffff810caeba>] ? do_wp_page+0x646/0x707
[51120.422327] [<ffffffff810ee4b9>] ? do_sync_read+0xce/0x113
[51120.422356] [<ffffffff81064ac2>] ? autoremove_wake_function+0x0/0x2e
[51120.422386] [<ffffffff810eef15>] ? vfs_read+0xb9/0xff
[51120.422413] [<ffffffff810ef017>] ? sys_read+0x45/0x6e
[51120.422441] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[51120.422469] Code: 48 8b 0c 24 0f 4f d0 b8 00 10 00 00 48 8b 7c 24 20 44 29 e8 39 c2 0f 4e c2 49 63 d6 48 63 d8 48 8b 41 08 48 8b 4c 24 08 49 89 d8 <48> 8b 34 d0 44 89 ea e8 42 ed ff ff 85 c0 79 13 48 8b 5c 24 20
[51120.422694] RIP [<ffffffff8123c6c9>] dma_memcpy_to_iovec+0xdd/0x145
[51120.422728] RSP <ffff8803295b3b48>
[51120.422751] CR2: ffff88036840c1c0
[51120.428200] ---[ end trace b29c463a1cbbe74e ]---
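
Since these recur every few hours, one way to check whether each oops hits the same code path is to reduce every saved trace to a few fields and compare them. Below is a minimal sketch of that idea in Python. It assumes the 2.6.32-style layout shown above (an `IP:` line, a `CR2:` line, and a `Call Trace:` block that ends at the `Code:` line); `summarize_oops` is a hypothetical helper name, not part of any kernel tooling.

```python
import re

# "IP: [<addr>] symbol+0xoff/0xlen"; the \b keeps this from matching "RIP:".
IP_RE = re.compile(r"\bIP: \[<[0-9a-f]+>\] (\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Call trace entries such as "[<addr>] ? tcp_recvmsg+0x696/0xa9e".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?:\? )?(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Faulting address as reported in the register dump.
CR2_RE = re.compile(r"\bCR2: ([0-9a-f]{16})")


def summarize_oops(text):
    """Return the faulting symbol, fault address, and call-trace symbols."""
    summary = {"ip": None, "cr2": None, "trace": []}
    in_trace = False
    for line in text.splitlines():
        # Keep only the first IP line of the oops.
        if summary["ip"] is None:
            m = IP_RE.search(line)
            if m:
                summary["ip"] = m.group(1)
        m = CR2_RE.search(line)
        if m:
            summary["cr2"] = m.group(1)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            if m:
                summary["trace"].append(m.group(1))
            else:
                # First non-frame line (e.g. "Code:") ends the trace.
                in_trace = False
    return summary
```

To use it, save each oops from dmesg to its own file (the name `oops.txt` here is just an example) and call `summarize_oops(open("oops.txt").read())`. If the `ip` and `trace` values match across oopses, the crashes share a code path.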