LinuxQuestions.org


event2000 08-28-2011 05:29 AM

How to find out WHY my linux box has restarted
 
Hi,

I wanted to connect to my headless (no screen etc.) Linux box, but vncserver wasn't running. Going through the logs, I found out the system had started up fresh (dunno if it was a reboot or something else) 2 days ago.

Now this already happened some time ago, but I didn't give it much thought.

But since it seems to be happening more frequently, I want to know what's causing it.

It's a Debian squeeze box running Xen 4.0 with a Windows VM. I also have an APC Smart-UPS 1400 connected, and I'm monitoring it with apcupsd.

Now the apcupsd log is clean, and xend.log just shows the Xen daemon starting at the time the system came up again.

How can I find out whether some program or whatever sent a reboot command?

Or if it's some PC problem (I know Xen has problems with some mainboard BIOSes etc.) and the PC sometimes just suddenly reboots, is there any way to see whether the system shut down before the reboot or just suddenly started again?

Thanks in advance

GNULinuxGuy 08-28-2011 06:42 PM

If there's nothing logged, then it's extremely unlikely it was told to reboot. Possibilities in order of likelihood:

1) Hardware problem causing spontaneous reboot.
2) Security issue; buggy root kit, or someone with physical access.
3) Genuine kernel/driver bug/regression (pretty unlikely since you say it's Debian, but stranger things have happened).

Assuming you have no real computer diagnostic equipment, I would start with booting a live CD/DVD/USB stick and do some load testing. See if you can't get it to spontaneously reboot in front of you. Assuming this system was stable at one point, and you haven't just done a kernel/driver update, etc, I would say the most likely culprit is the power supply or RAM. If you suspect the former, some places will test your power supply for free. If you suspect it's the RAM, you should be able to verify this by running memtest86+ for an extended period, and then use trial and error to identify which stick is acting up.

Good luck! :)

event2000 08-29-2011 12:44 PM

If some system service, like apcupsd, did for some reason give the system a shutdown or reboot command, where would this be logged?

GNULinuxGuy 08-30-2011 01:57 AM

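The pseudo user reboot is recorded each time the system boots, whether or not it was told to reboot. You will see a reboot line when you use the 'last' command (or simply type 'last reboot' to see only the reboots). 'last -x' also shows the shutdown entries, so a reboot line with no shutdown entry before it suggests the box went down without being told to.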
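
For anyone who wants to automate that check, here is a rough Python sketch that scans the output of 'last -x' and labels each boot. It assumes that wtmp gets a "reboot" record at every boot and a "shutdown" record on an orderly shutdown, and that 'last' prints the newest entries first. The function names classify_boots and run_last are made up for this example, and the labels are only a hint, not proof of what happened.

```python
import subprocess


def classify_boots(last_output):
    """Label each 'reboot' record in `last -x` output, oldest boot first.

    'clean'   : the record just before this boot was a 'shutdown' entry
    'unclean' : the record just before it was another boot (no shutdown logged)
    'unknown' : no earlier reboot/shutdown record is left in wtmp
    """
    results = []
    previous = None  # kind of the last reboot/shutdown record seen so far
    # `last` prints newest first, so walk the lines in reverse (oldest first).
    for line in reversed(last_output.splitlines()):
        fields = line.split()
        # Skip user logins, runlevel changes, blank lines and the wtmp footer.
        if not fields or fields[0] not in ("reboot", "shutdown"):
            continue
        kind = fields[0]
        if kind == "reboot":
            if previous == "shutdown":
                status = "clean"
            elif previous == "reboot":
                status = "unclean"
            else:
                status = "unknown"
            results.append((status, line.strip()))
        previous = kind
    return results


def run_last():
    """Return the text printed by `last -x` (needs read access to wtmp)."""
    return subprocess.run(
        ["last", "-x"], capture_output=True, text=True, check=True
    ).stdout
```

Calling classify_boots(run_last()) on the box itself gives a list of (status, line) pairs. An "unclean" entry only means no shutdown record was written before that boot, which fits a power loss, crash or hardware reset, but it does not say which one.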

cascade9 08-30-2011 02:13 AM

Quote:

Originally Posted by GNULinuxGuy (Post 4455682)
If there's nothing logged, then it's extremely unlikely it was told to reboot. Possibilities in order of likelihood:

1) Hardware problem causing spontaneous reboot.

Assuming this system was stable at one point, and you haven't just done a kernel/driver update, etc, I would say the most likely culprit is the power supply or RAM. If you suspect the former, some places will test your power supply for free. If you suspect it's the RAM, you should be able to verify this by running memtest86+ for an extended period, and then use trial and error to identify which stick is acting up.

All valid, but I'd add 'overheating CPU'.

GNULinuxGuy 08-31-2011 01:33 AM

Quote:

Originally Posted by cascade9 (Post 4456787)
All valid, but I'd add 'overheating CPU'.

Indeed this is a possibility I probably should have mentioned, however in most instances where the CPU is overheating, the BIOS should be shutting the computer off, not restarting it. It might be right on the edge, and/or the BIOS doesn't have the best defaults for making sure his specific CPU stays stable, but if this were the case, most likely he would notice the pattern of CPU load causing it to reboot. ;)

It's always a good idea to use something to monitor your hardware sensors, especially when you're thinking about availability with things such as a UPS. :)

chrism01 08-31-2011 07:51 PM

Can you use this http://linux.die.net/man/1/sensors (& ref to http://www.lm-sensors.nu/) ?
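
If installing lm-sensors is not an option, a minimal Python sketch like the one below can read temperatures straight from sysfs. It assumes the kernel exposes hwmon temperature files named temp*_input (values in millidegrees Celsius) under /sys/class/hwmon, either directly in each hwmon directory or in its device subdirectory, which depends on the kernel version and on the right sensor driver being loaded. The function names and the 80 °C limit are arbitrary choices for illustration, not recommended values.

```python
import glob
import os


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {sysfs path: temperature in degrees C} for readable sensors."""
    readings = {}
    # Depending on the kernel, the files sit in hwmonN/ or in hwmonN/device/.
    patterns = [
        os.path.join(root, "hwmon*", "temp*_input"),
        os.path.join(root, "hwmon*", "device", "temp*_input"),
    ]
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as handle:
                    millidegrees = int(handle.read().strip())
            except (OSError, ValueError):
                continue  # sensor not readable or not reporting a number
            readings[path] = millidegrees / 1000.0
    return readings


def over_threshold(readings, limit_c=80.0):
    """Return only the readings at or above limit_c (an assumed limit)."""
    return {path: temp for path, temp in readings.items() if temp >= limit_c}
```

Run from cron every minute or so and appended to a log file, the output of read_hwmon_temps() would show whether the temperature was climbing just before an unexplained restart.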


All times are GMT -5.