LinuxQuestions.org


Sheepa 07-30-2013 09:12 PM

Linux server crash (kernel NULL pointer dereference + soft lockup - CPU#1 stuck)
 
I have a Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux server that keeps crashing, usually once every 24-72 hours.

I run lighttpd, mysql, haproxy and a couple of always-running java processes together with a bunch of shorter-lived java processes.

Below I have linked /var/log/syslog and /var/log/messages. Both contain the kernel NULL pointer dereference and soft lockup bug lines.

syslog: http://pastebin.com/7VxdkEYu
messages: http://pastebin.com/UdiN2y0d
dmesg: http://pastebin.com/8iQC5c0K

Does anyone have any idea how to debug this?

Thanks

Dutch Master 07-31-2013 06:40 AM

I can't see any obvious errors. On the next crash, try strace:
Code:

man strace
It's also worth considering hardware failure.

Sheepa 07-31-2013 01:30 PM

How can I use strace when the system has crashed?

I'm renting the server, and the provider ran a full hardware test and found no errors. Besides, I'm getting this on multiple servers.

It crashed again just now while I had a terminal open. I got this just before it died:

Code:

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.097507] Oops: 0000 [#1] SMP

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.098872] Stack:

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.099163] Call Trace:

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.099197]  <IRQ>

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102262]  <EOI>

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102267] Code: 04 01 00 00 7f a1 eb b0 58 5b 5d c3 48 81 fa 30 75 00 00 76 0c 48 c7 c0 d8 6d 2c 81 ba 30 75 00 00 83 fe 03 74 05 83 fe 01 75 21 <c8> 8b 05 12 e2 3c 00 40 88 b7 12 04 00 00 48 8d b7 50 03 00 00

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102442] CR2: 000000004744765c

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102838] Kernel panic - not syncing: Fatal exception in interrupt
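
The broadcast above interleaves syslogd banners with the kernel lines, which makes it hard to read. A minimal sketch, assuming the terminal output has been saved as text, of how the kernel lines could be pulled out with their timestamps. The function name parse_kernel_broadcasts and the shortened sample are illustrative only and not part of any existing tool.

```python
import re

# Matches lines such as " kernel:[60109.097507] Oops: 0000 [#1] SMP".
# The bracketed number is the kernel timestamp in seconds since boot.
KERNEL_LINE = re.compile(r"^\s*kernel:\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_kernel_broadcasts(text):
    """Return (seconds_since_boot, message) pairs, skipping syslogd banners."""
    entries = []
    for line in text.splitlines():
        match = KERNEL_LINE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2).strip()))
    return entries


# Shortened copy of the broadcast quoted in this post.
sample = """\
Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.097507] Oops: 0000 [#1] SMP

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102442] CR2: 000000004744765c

Message from syslogd@gs at Jul 31 19:37:37 ...
 kernel:[60109.102838] Kernel panic - not syncing: Fatal exception in interrupt
"""

for stamp, message in parse_kernel_broadcasts(sample):
    print(f"{stamp:.6f}  {message}")
```

This only tidies what reached the terminal. The register dump and the function names of the call trace are not in the broadcast, so they would still have to come from the syslog or dmesg captures linked earlier.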



All times are GMT -5.