LinuxQuestions.org
Old 02-19-2004, 10:02 AM   #1
quill18
LQ Newbie
 
Registered: Feb 2004
Posts: 7

Rep: Reputation: 0
Kernel Oops - System also hangs under load.


Hi everyone, here's the situation:

Two days ago I signed up for a dedicated web host. Everything looked great on the server and testing revealed no problems, so I moved my busy (250,000 hits/day) website to the machine. Everything worked great for a couple of hours, then everything hung. I put in a support request to have the machine rebooted, but the same thing happens every time I set it live (I can't get the machine to hang with anything I do myself). The system is Red Hat 9.

My provider swears that they've checked the hardware. I don't actually believe them, but...

Anyway - I don't have access to the local console, so I can't vouch for any message that might be there. The logs don't usually reveal anything, but there are 2 Oops entries in there which did not produce a crash. Here's one:

ksymoops 2.4.5 on i686 2.4.20-28.9. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.20-28.9/ (default)
-m /boot/System.map-2.4.20-28.9 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
Error (pclose_local): find_objects pclose failed 0x100
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique module object. Trace may not be reliable.
Feb 19 04:03:25 sm12311 kernel: Unable to handle kernel paging request at virtual address 13cd7cef
Feb 19 04:03:25 sm12311 kernel: c01187bb
Feb 19 04:03:25 sm12311 kernel: *pde = 00000000
Feb 19 04:03:25 sm12311 kernel: Oops: 0000
Feb 19 04:03:25 sm12311 kernel: CPU: 0
Feb 19 04:03:25 sm12311 kernel: EIP: 0060:[<c01187bb>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Feb 19 04:03:25 sm12311 kernel: EFLAGS: 00010086
Feb 19 04:03:25 sm12311 kernel: eax: f2e5b214 ebx: d8244000 ecx: 51eb851f edx: 00000ea1
Feb 19 04:03:25 sm12311 kernel: esi: d2e42000 edi: 00000000 ebp: d8245f54 esp: d8245f4c
Feb 19 04:03:25 sm12311 kernel: ds: 0068 es: 0068 ss: 0068
Feb 19 04:03:25 sm12311 kernel: Process logrotate (pid: 1241, stackpage=d8245000)
Feb 19 04:03:25 sm12311 kernel: Stack: d2e42000 00000000 bfffe3c8 c011b479 01200011 bfffe358 d8245fc4 00000000
Feb 19 04:03:25 sm12311 kernel: 00000000 400270c8 00000028 00000066 080e9830 00000000 fffffff2 00000000
Feb 19 04:03:25 sm12311 kernel: bfffe3e0 c0127d92 00000000 00000000 400270c8 c0107b39 01200011 bfffe358
Feb 19 04:03:25 sm12311 kernel: Call Trace: [<c011b479>] do_fork [kernel] 0x99 (0xd8245f58))
Feb 19 04:03:25 sm12311 kernel: [<c0127d92>] sys_rt_sigprocmask [kernel] 0xf2 (0xd8245f90))
Feb 19 04:03:25 sm12311 kernel: [<c0107b39>] sys_clone [kernel] 0x49 (0xd8245fa0))
Feb 19 04:03:25 sm12311 kernel: [<c010953f>] system_call [kernel] 0x33 (0xd8245fc0))
Feb 19 04:03:25 sm12311 kernel: Code: 02 89 d0 f7 e1 c1 ea 05 89 53 38 8b 56 38 8d 14 92 8d 14 92



>>EIP; c01187bb <wake_up_forked_process+2b/f0> <=====

>>eax; f2e5b214 <END_OF_CODE+124c68a9/????>
>>ebx; d8244000 <_end+17e64e80/2042dee0>
>>ecx; 51eb851f Before first symbol
>>edx; 00000ea1 Before first symbol
>>esi; d2e42000 <_end+12a62e80/2042dee0>
>>ebp; d8245f54 <_end+17e66dd4/2042dee0>
>>esp; d8245f4c <_end+17e66dcc/2042dee0>

Trace; c011b479 <do_fork+99/140>
Trace; c0127d92 <sys_rt_sigprocmask+f2/160>
Trace; c0107b39 <sys_clone+49/70>
Trace; c010953f <system_call+33/38>

Code; c01187bb <wake_up_forked_process+2b/f0>
00000000 <_EIP>:
Code; c01187bb <wake_up_forked_process+2b/f0> <=====
0: 02 89 d0 f7 e1 c1 add 0xc1e1f7d0(%ecx),%cl <=====
Code; c01187c1 <wake_up_forked_process+31/f0>
6: ea 05 89 53 38 8b 56 ljmp $0x568b,$0x38538905
Code; c01187c8 <wake_up_forked_process+38/f0>
d: 38 8d 14 92 8d 14 cmp %cl,0x148d9214(%ebp)
Code; c01187ce <wake_up_forked_process+3e/f0>
13: 92 xchg %eax,%edx


2 warnings and 3 errors issued. Results may not be reliable.
 
Old 02-19-2004, 04:22 PM   #2
h/w
Senior Member
 
Registered: Mar 2003
Location: New York, NY
Distribution: Debian Testing
Posts: 1,286

Rep: Reputation: 45
A paging request failure? Did they change (add or remove) the RAM or something in there? They do that sometimes, and maybe the module they popped in is not OK?

I don't know how much decoding the oops message will help if that is the case, but do you know how to? You basically have to check your System.map file for the symbol at 'c01187bb'. If no symbol corresponds to c01187bb exactly, find the nearest symbol below it and work out the offset, then find the offending lines inside that function.

I don't think I have been of much help here.
 
Old 02-19-2004, 04:50 PM   #3
quill18
LQ Newbie
 
Registered: Feb 2004
Posts: 7

Original Poster
Rep: Reputation: 0
Quote:
Originally posted by h/w
A paging request failure? Did they change (add or remove) the RAM or something in there? They do that sometimes, and maybe the module they popped in is not OK?
I'm sure they haven't changed anything - but I *really* wish they would. I'm pretty convinced that it's a hardware error.

Quote:
I don't know how much decoding the oops message will help if that is the case, but do you know how to? You basically have to check your System.map file for the symbol at 'c01187bb'. If no symbol corresponds to c01187bb exactly, find the nearest symbol below it and work out the offset, then find the offending lines inside that function.
The nearest matches in my system map are:

c0118770 T wake_up_state
c0118790 T wake_up_forked_process
c0118880 T sched_exit
c01188e0 T schedule_tail
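
For anyone following along, the lookup described in the quote can be sketched in a few lines of Python. This is only an illustration, assuming nothing beyond the four System.map lines listed above; `parse_system_map` and `resolve` are made-up helper names, not part of ksymoops or any kernel tool. It finds the nearest symbol at or below the faulting address and reports the offset into it, in the same symbol+offset/size form that ksymoops prints.

```python
import bisect

# The four System.map entries quoted above (address, type, name).
SYSTEM_MAP = """\
c0118770 T wake_up_state
c0118790 T wake_up_forked_process
c0118880 T sched_exit
c01188e0 T schedule_tail
"""


def parse_system_map(text):
    """Return a list of (address, name) tuples sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(address, symbols):
    """Map an address to (name, offset, size).

    Uses the nearest symbol at or below the address. The size is the
    distance to the next symbol, or None when no later symbol is known.
    Returns None if the address lies below the first symbol.
    """
    addresses = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addresses, address) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    size = addresses[i + 1] - start if i + 1 < len(symbols) else None
    return name, address - start, size


symbols = parse_system_map(SYSTEM_MAP)
name, offset, size = resolve(0xC01187BB, symbols)
print(f"{name}+{offset:x}/{size:x}")
```

For c01187bb this prints wake_up_forked_process+2b/f0, which agrees with the ksymoops line in the first post.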

I've actually noticed another Oops in my log, from a few days ago. The error message is different, but I see the same c01187bb as above:

Feb 17 13:36:11 sm12311 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000060
Feb 17 13:36:11 sm12311 kernel: e081f20e
Feb 17 13:36:11 sm12311 kernel: *pde = 00000000
Feb 17 13:36:11 sm12311 kernel: Oops: 0002
Feb 17 13:36:11 sm12311 kernel: CPU: 0
Feb 17 13:36:11 sm12311 kernel: EIP: 0060:[<e081f20e>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Feb 17 13:36:11 sm12311 kernel: EFLAGS: 00010282
Feb 17 13:36:11 sm12311 kernel: eax: 00000000 ebx: 00000000 ecx: c255a2f4 edx: 00000000
Feb 17 13:36:11 sm12311 kernel: esi: 00000001 edi: dbe33880 ebp: 00000000 esp: cb59fdc8
Feb 17 13:36:11 sm12311 kernel: ds: 0068 es: 0068 ss: 0068
Feb 17 13:36:11 sm12311 kernel: Process mysqld (pid: 2012, stackpage=cb59f000)
Feb 17 13:36:11 sm12311 kernel: Stack: cb59fe18 00000001 d0925704 d0925580 e081f5f1 dbe33880 d0925580 00000528
Feb 17 13:36:11 sm12311 kernel: cb59fe00 cb59fe18 00000001 00000003 0001913f 00000000 d09256c8 00018d1d
Feb 17 13:36:11 sm12311 kernel: 00000000 d3379410 00019122 d2e26c54 d2e4c870 0001913f d2e303bc 00000400
Feb 17 13:36:11 sm12311 kernel: Call Trace: [<e081f5f1>] ext3_get_block_handle [ext3] 0x251 (0xcb59fdd8))
Feb 17 13:36:11 sm12311 kernel: [<c0149b55>] get_unused_buffer_head [kernel] 0x65 (0xcb59fe2c))
Feb 17 13:36:11 sm12311 kernel: [<e081f6aa>] ext3_get_block [ext3] 0x4a (0xcb59fe50))
Feb 17 13:36:11 sm12311 kernel: [<c014a423>] __block_prepare_write [kernel] 0x193 (0xcb59fe70))
Warning (Oops_read): Code line not seen, dumping what data is available

>>EIP; c01187bb <wake_up_forked_process+2b/f0> <=====

>>eax; f2e5b214 <END_OF_CODE+124c68a9/????>
>>ebx; d8244000 <_end+17e64e80/2042dee0>
>>ecx; 51eb851f Before first symbol
>>edx; 00000ea1 Before first symbol
>>esi; d2e42000 <_end+12a62e80/2042dee0>
>>ebp; d8245f54 <_end+17e66dd4/2042dee0>
>>esp; d8245f4c <_end+17e66dcc/2042dee0>

Trace; c011b479 <do_fork+99/140>
Trace; c0127d92 <sys_rt_sigprocmask+f2/160>
Trace; c0107b39 <sys_clone+49/70>
Trace; c010953f <system_call+33/38>

Code; c01187bb <wake_up_forked_process+2b/f0>
00000000 <_EIP>:
Code; c01187bb <wake_up_forked_process+2b/f0> <=====
0: 02 89 d0 f7 e1 c1 add 0xc1e1f7d0(%ecx),%cl <=====
Code; c01187c1 <wake_up_forked_process+31/f0>
6: ea 05 89 53 38 8b 56 ljmp $0x568b,$0x38538905
Code; c01187c8 <wake_up_forked_process+38/f0>
d: 38 8d 14 92 8d 14 cmp %cl,0x148d9214(%ebp)
Code; c01187ce <wake_up_forked_process+3e/f0>
13: 92 xchg %eax,%edx
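
One thing that may be worth double-checking before comparing the two reports: a rough Python sketch that pulls the EIP and process name out of the raw kernel log lines, independent of any ksymoops decoding. The `summarize` helper and the trimmed `RAW_LOG` sample are illustrative assumptions; the four log lines are copied from the two reports in this thread.

```python
import re

# EIP and Process lines copied from the Feb 19 and Feb 17 reports.
RAW_LOG = """\
Feb 19 04:03:25 sm12311 kernel: EIP: 0060:[<c01187bb>] Not tainted
Feb 19 04:03:25 sm12311 kernel: Process logrotate (pid: 1241, stackpage=d8245000)
Feb 17 13:36:11 sm12311 kernel: EIP: 0060:[<e081f20e>] Not tainted
Feb 17 13:36:11 sm12311 kernel: Process mysqld (pid: 2012, stackpage=cb59f000)
"""

EIP_RE = re.compile(r"kernel: EIP:\s+[0-9a-f]+:\[<([0-9a-f]+)>\]")
PROC_RE = re.compile(r"kernel: Process (\S+) \(pid: (\d+),")


def summarize(log_text):
    """Return one dict per oops report with its EIP, process name and pid.

    An EIP line starts a new report; the next Process line is attached
    to the most recent report.
    """
    reports = []
    for line in log_text.splitlines():
        match = EIP_RE.search(line)
        if match:
            reports.append({"eip": int(match.group(1), 16),
                            "process": None, "pid": None})
            continue
        match = PROC_RE.search(line)
        if match and reports:
            reports[-1]["process"] = match.group(1)
            reports[-1]["pid"] = int(match.group(2))
    return reports


for report in summarize(RAW_LOG):
    print(f"{report['eip']:08x} {report['process']} (pid {report['pid']})")
```

On these lines the Feb 17 report gives EIP e081f20e in mysqld, not c01187bb, and its call trace starts in ext3. The decoded block after the "Code line not seen" warning repeats the register values from the Feb 19 report, so it may belong to the first report rather than the second.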

I hate not running my own hardware...but their bandwidth is so cheap...
 
Old 02-19-2004, 05:17 PM   #4
h/w
Senior Member
 
Registered: Mar 2003
Location: New York, NY
Distribution: Debian Testing
Posts: 1,286

Rep: Reputation: 45
<wake_up_forked_process>? Issues with the system scheduler then, eh? Apart from agreeing with what you said (a hardware issue), I don't know what's going wrong here, as I have not seen this before.
 
  

