Quote:
Originally posted by h/w
A paging request failure? Did they change (add or remove) the RAM or something out there? They do that sometimes, and maybe the module they popped in isn't OK?
|
I'm sure they haven't changed anything - but I *really* wish they would. I'm pretty convinced that it's a hardware error.
Quote:
I don't know how much decoding the Oops message will help if that is the case, but do you know how to? You basically have to check your System.map file for the symbol at 'c01187bb'. If no symbol corresponds exactly to c01187bb, find the nearest symbol below it and work out the offset. Then find the offending lines inside that function.
|
The nearest matches in my System.map are:
c0118770 T wake_up_state
c0118790 T wake_up_forked_process
c0118880 T sched_exit
c01188e0 T schedule_tail
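For anyone following along, here is a rough Python sketch of the lookup described in the quote: take the last symbol that starts at or below the faulting address and subtract to get the offset. The function names (parse_system_map, resolve) are made up for illustration, the excerpt is just the four lines above rather than a full System.map, and the "size" is only estimated as the distance to the next listed symbol, which is an assumption and not something System.map records.

```python
import bisect

# The four System.map entries quoted in the post (address, type, name).
SYSTEM_MAP_EXCERPT = """\
c0118770 T wake_up_state
c0118790 T wake_up_forked_process
c0118880 T sched_exit
c01188e0 T schedule_tail
"""


def parse_system_map(text):
    """Return (address, name) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Find the last symbol starting at or below `address`.

    Returns (name, offset, size), or None if the address lies before
    the first symbol. `size` is a guess: the gap to the next symbol,
    or None when there is no next symbol.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    size = symbols[i + 1][0] - start if i + 1 < len(symbols) else None
    return name, address - start, size


symbols = parse_system_map(SYSTEM_MAP_EXCERPT)
name, offset, size = resolve(symbols, 0xC01187BB)
print(f"{name}+{offset:x}/{size:x}")  # prints wake_up_forked_process+2b/f0
```

That agrees with the wake_up_forked_process+2b/f0 that ksymoops reports further down, so the address falls 0x2b bytes into wake_up_forked_process.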
I've actually noticed another Oops in my log, from a few days ago. The error message is different, but I see the same c01187bb as above:
Feb 17 13:36:11 sm12311 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000060
Feb 17 13:36:11 sm12311 kernel: e081f20e
Feb 17 13:36:11 sm12311 kernel: *pde = 00000000
Feb 17 13:36:11 sm12311 kernel: Oops: 0002
Feb 17 13:36:11 sm12311 kernel: CPU: 0
Feb 17 13:36:11 sm12311 kernel: EIP: 0060:[<e081f20e>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Feb 17 13:36:11 sm12311 kernel: EFLAGS: 00010282
Feb 17 13:36:11 sm12311 kernel: eax: 00000000 ebx: 00000000 ecx: c255a2f4 edx: 00000000
Feb 17 13:36:11 sm12311 kernel: esi: 00000001 edi: dbe33880 ebp: 00000000 esp: cb59fdc8
Feb 17 13:36:11 sm12311 kernel: ds: 0068 es: 0068 ss: 0068
Feb 17 13:36:11 sm12311 kernel: Process mysqld (pid: 2012, stackpage=cb59f000)
Feb 17 13:36:11 sm12311 kernel: Stack: cb59fe18 00000001 d0925704 d0925580 e081f5f1 dbe33880 d0925580 00000528
Feb 17 13:36:11 sm12311 kernel: cb59fe00 cb59fe18 00000001 00000003 0001913f 00000000 d09256c8 00018d1d
Feb 17 13:36:11 sm12311 kernel: 00000000 d3379410 00019122 d2e26c54 d2e4c870 0001913f d2e303bc 00000400
Feb 17 13:36:11 sm12311 kernel: Call Trace: [<e081f5f1>] ext3_get_block_handle [ext3] 0x251 (0xcb59fdd8))
Feb 17 13:36:11 sm12311 kernel: [<c0149b55>] get_unused_buffer_head [kernel] 0x65 (0xcb59fe2c))
Feb 17 13:36:11 sm12311 kernel: [<e081f6aa>] ext3_get_block [ext3] 0x4a (0xcb59fe50))
Feb 17 13:36:11 sm12311 kernel: [<c014a423>] __block_prepare_write [kernel] 0x193 (0xcb59fe70))
Warning (Oops_read): Code line not seen, dumping what data is available
>>EIP; c01187bb <wake_up_forked_process+2b/f0> <=====
>>eax; f2e5b214 <END_OF_CODE+124c68a9/????>
>>ebx; d8244000 <_end+17e64e80/2042dee0>
>>ecx; 51eb851f Before first symbol
>>edx; 00000ea1 Before first symbol
>>esi; d2e42000 <_end+12a62e80/2042dee0>
>>ebp; d8245f54 <_end+17e66dd4/2042dee0>
>>esp; d8245f4c <_end+17e66dcc/2042dee0>
Trace; c011b479 <do_fork+99/140>
Trace; c0127d92 <sys_rt_sigprocmask+f2/160>
Trace; c0107b39 <sys_clone+49/70>
Trace; c010953f <system_call+33/38>
Code; c01187bb <wake_up_forked_process+2b/f0>
00000000 <_EIP>:
Code; c01187bb <wake_up_forked_process+2b/f0> <=====
0: 02 89 d0 f7 e1 c1 add 0xc1e1f7d0(%ecx),%cl <=====
Code; c01187c1 <wake_up_forked_process+31/f0>
6: ea 05 89 53 38 8b 56 ljmp $0x568b,$0x38538905
Code; c01187c8 <wake_up_forked_process+38/f0>
d: 38 8d 14 92 8d 14 cmp %cl,0x148d9214(%ebp)
Code; c01187ce <wake_up_forked_process+3e/f0>
13: 92 xchg %eax,%edx
I hate not running my own hardware...but their bandwidth is so cheap...