LinuxQuestions.org
09-12-2011, 07:49 AM   #1
ckozler
LQ Newbie
 
Registered: Nov 2010
Posts: 5

Rep: Reputation: 0
Random Reboots (+ Linux Coding Question)


Hi Guys,

This is a two-part question; the second part came out of the issue described in the first.

First question: I am running a Dell R610 server with CentOS 5.6 and a Xen kernel:
Code:
[08:39:16][root@virtualmaster-01:~]$ uname -a
Linux virtualmaster-01.mydomain.com 2.6.18-238.19.1.el5xen #1 SMP Fri Jul 15 08:16:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Twice now I have seen this server reboot at random on a weekend: once on a Sunday in the early hours, and again this past Saturday at around 3 AM.

Code:
[08:40:05][root@virtualmaster-01:~]$ last -n 10 
root     pts/7        192.168.1.211    Mon Sep 12 08:39   still logged in   
{...}   
reboot   system boot  2.6.18-238.19.1. Sat Sep 10 03:57         (2+04:43)
My question is: is there any way I can find out what caused this? I see no logs about a kernel panic. There was no power failure: this server is colocated with about 8 other servers and a NAS device, none of which reported a similar issue. The server also has dual power supplies and received a memory upgrade about 2 weeks ago; up until this point, though, nothing had happened.

What interests me, though, is that last shows it as a reboot and not as a crash. /var/log/messages* does not show a command-initiated reboot (e.g. reboot executed at the command line); it only shows logs from the subsequent boot.
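One way to check this more directly is last -x, which also lists the shutdown records written during an orderly halt or reboot. As far as I know the "reboot" line is written at every boot, however the previous session ended, so the thing to look for is whether a "shutdown" line sits directly beneath it. Below is a rough sketch of that check. The function name classify_boots is made up for illustration, and it assumes the default last column layout (weekday, month, day and time in fields 5 to 8), which may differ on other versions.

```shell
#!/bin/sh
# Reads `last -x` output (newest record first) on stdin and labels each boot.
# Usage on a real system: last -x | classify_boots
# Assumption: default `last` layout, with the timestamp in fields 5-8.
classify_boots() {
    awk '
        function when() { return $5 " " $6 " " $7 " " $8 }
        $1 == "reboot" {
            # Two boot records with no shutdown record in between: the boot
            # being held was not preceded by an orderly shutdown.
            if (pending != "") print pending ": no shutdown record (crash, reset or power loss?)"
            pending = when()
            next
        }
        $1 == "shutdown" && pending != "" {
            print pending ": preceded by an orderly shutdown"
            pending = ""
        }
        END { if (pending != "") print pending ": oldest boot in the log, cannot tell" }
    '
}
```

If the Saturday boot comes out with no shutdown record, that would suggest a hard reset rather than something calling reboot or init 6.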

Any ideas? Is there anything else I could supply that would give you more details? I do not have automatic installation of updates enabled either; I do that manually.

I should also note that I am the only one with access to this box, both physically and remotely.

My second question:

Could anyone point me in the direction of how to hook Linux functions and system calls? For instance, I want to write a small driver/daemon utility that will catch a reboot or a shutdown, whether user-initiated or a crash/system reboot like the one described here, so I can figure out my own way to handle it. I probably won't use it on any production system; I am more curious about how to do it so I can brush up on my C/C++ knowledge. Any pointers, directions, or tutorials I could look at would be a great help. Thanks!
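As a starting point in userspace, and not the kernel-level hook asked about, here is a minimal sketch of a watcher that relies on the SIGTERM that SysV init scripts normally send to remaining processes during an orderly shutdown or reboot. It cannot see a crash or hard reset, since no signal is delivered in that case. The log path and the function names are assumptions for illustration. For the in-kernel route, register_reboot_notifier() from linux/reboot.h would be the thing to read up on.

```shell
#!/bin/sh
# Hypothetical log location; override by exporting SHUTDOWN_LOG.
SHUTDOWN_LOG=${SHUTDOWN_LOG:-/var/log/shutdown-watch.log}

on_term() {
    # `runlevel` prints "<previous> <current>"; on SysV init 0 = halt, 6 = reboot.
    level=$(runlevel 2>/dev/null | awk '{print $2}')
    echo "$(date '+%Y-%m-%d %H:%M:%S') SIGTERM received, runlevel=${level:-unknown}" >> "$SHUTDOWN_LOG"
    sync    # flush the log line to disk before the system goes down
    exit 0
}

# Call this as the last line of the script to start watching.
watch_for_shutdown() {
    trap on_term TERM
    while :; do
        sleep 1    # the trap runs once the current sleep returns
    done
}
```

A plain kill of the watcher would also trigger the handler, which is why the runlevel is logged alongside the timestamp.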
 
09-12-2011, 08:18 AM   #2
zackwasa
Member
 
Registered: Sep 2011
Posts: 52

Rep: Reputation: Disabled
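Try this script to log what is running on the server each minute. I hope it helps you determine the cause:
Code:
#!/bin/sh
# Snapshot system state. File names repeat daily, so at most 24 hours are kept.

mkdir -p /var/log/monitor
stamp=$(date +%H_%M)

top -b -n 1 > "/var/log/monitor/top_$stamp"        # CPU and process summary
netstat -anp > "/var/log/monitor/netstat_$stamp"   # sockets and owning PIDs
ps -efww > "/var/log/monitor/ps_$stamp"            # full process list
free > "/var/log/monitor/free_$stamp"              # memory usage
w > "/var/log/monitor/w_$stamp"                    # logged-in users
Add it to cron so that it runs every minute.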
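For the cron part, one possible way to set it up is a file under /etc/cron.d, where each line carries a user field after the five time fields. The helper below is only a sketch: the function name and both default paths are assumptions, so adjust them to wherever the script was actually saved.

```shell
#!/bin/sh
# Write a cron.d entry that runs the monitor script every minute as root.
# Both default paths are hypothetical placeholders.
install_monitor_cron() {
    script=${1:-/usr/local/sbin/monitor.sh}
    target=${2:-/etc/cron.d/monitor}
    printf '%s\n' "* * * * * root $script >/dev/null 2>&1" > "$target"
    chmod 644 "$target"    # cron.d files should not be group or world writable
}
```

Run install_monitor_cron once as root, and remove the file again when the investigation is over.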

RMI

Last edited by zackwasa; 01-12-2012 at 12:55 AM.
 
09-12-2011, 08:45 AM   #3
ckozler
LQ Newbie
 
Registered: Nov 2010
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by zackwasa
Try this script to log thing that runs on the server each minute. I hope it will help you determine the cause:
Code:
#!/bin/sh

mkdir -p /var/log/monitor
date=`date +%H_%M`

top -b -n 1 > /var/log/monitor/top_$date
netstat -anp > /var/log/monitor/netstat_$date
ps -efww > /var/log/monitor/ps_$date
free > /var/log/monitor/free_$date
w > /var/log/monitor/w_$date
Put it as cron to run each minute

I was already leaning towards implementing something like this, but I'm not sure it will identify the cause. I have a slight suspicion that a failing Xen driver, or something else, is calling init 6 and triggering a reboot. A hardware failure is possible, but I think the chance is small and that the cause is more likely software related.

That being said, cron jobs may not be the answer. I already have one that uses @reboot to email me and take a snapshot of all running processes and network connections as reboot is called. This could be a temporary fix, but if the system is actually crashing because of a failed driver, I have a feeling cron may never get called unless it too has hooks into init.
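A note on @reboot: as far as I know, cron runs those entries when the cron daemon starts after boot, so they see the system after the event and not before it. A per-minute job can still leave evidence behind even if the box dies without warning. The sketch below is a simple heartbeat that narrows the time of death to about a minute. The file path and function names are assumptions, and it presumes the @reboot job reads the file before the first per-minute run overwrites it.

```shell
#!/bin/sh
# Hypothetical location; override by exporting HEARTBEAT_FILE.
HEARTBEAT_FILE=${HEARTBEAT_FILE:-/var/log/monitor/heartbeat}

# Run from cron every minute: record the current time and flush it to disk.
write_heartbeat() {
    date '+%Y-%m-%d %H:%M:%S' > "$HEARTBEAT_FILE"
    sync
}

# Run once after boot (for example from the @reboot job) to see when the
# system was last known to be alive.
report_last_heartbeat() {
    if [ -r "$HEARTBEAT_FILE" ]; then
        echo "Last heartbeat before this boot: $(cat "$HEARTBEAT_FILE")"
    else
        echo "No heartbeat file found at $HEARTBEAT_FILE"
    fi
}
```

Comparing that timestamp with the boot time from last gives a rough window for when the system went down.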

Let me know what you think and thank you for your input!
 
  


All times are GMT -5.
