CentOS 3.x Random System Failure, Very Odd, Please Advise
Here is the problem: at random, usually over a period of a few days, I'll go to log into the system and notice the power LED is still on, but the screen is black and I cannot SSH in or do anything except hold the power button to restart it. I know Linux is a very stable OS, so I am having a hard time believing that it's a software issue. I am more inclined to say it's hardware related, but I have not been able to track it down. I have disabled most of the power management features I am aware of, because at first I thought that was the issue. Anyway, any help would be much appreciated. Thank you.
Check /var/log/messages for errors just before the failure (find the syslog restart message from your last boot, then look at the lines above it in the file).
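As a rough illustration of that log check, here is a minimal Python sketch. It assumes the last "syslogd ... restart" line in the file is the one written at the most recent boot, which may not hold if syslogd is also restarted by something else. The function names `lines_before_last_restart` and `read_log` are made up for this example.

```python
import re

# Matches the marker syslogd writes when it (re)starts, e.g.
# "Dec 1 09:01:09 alice syslogd 1.4.1: restart."
RESTART = re.compile(r"syslogd .*restart")

def lines_before_last_restart(lines, context=20):
    """Return up to `context` lines that precede the final restart marker."""
    last = None
    for index, line in enumerate(lines):
        if RESTART.search(line):
            last = index
    if last is None:
        return []  # no restart marker found
    return lines[max(0, last - context):last]

def read_log(path="/var/log/messages"):
    """Read the log as a list of lines; this usually needs root."""
    with open(path, errors="replace") as handle:
        return handle.read().splitlines()
```

Run as root, something like `print("\n".join(lines_before_last_restart(read_log())))` would show whatever was logged right before the machine came back up.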
This is typically a heat or power problem. I have also once seen a flaky PCI card cause this.
Some likely causes:
- failed or disconnected fan causing an overheat
- excessive dust accumulation
- failing power supply
- power spikes - a high current device on the same electrical circuit, like a laser printer, vacuum, failing UPS or surge suppressor, etc.
- induced power spikes - lightning, power lines "slapping" in the wind or in trees, etc.
My system log didn't show any odd errors; however, it looks like syslogd restarts every hour.
Nov 30 17:00:06 alice syslogd 1.4.1: restart.
Nov 30 18:00:04 alice syslogd 1.4.1: restart.
Nov 30 19:00:04 alice syslogd 1.4.1: restart.
Nov 30 20:00:05 alice syslogd 1.4.1: restart.
Dec 1 09:01:09 alice syslogd 1.4.1: restart.
Dec 1 09:01:09 alice syslog: syslogd startup succeeded
It stopped working on Nov 30 and stayed that way until I hard rebooted it today, Dec 1.
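One way to pin down when the machine went silent is to look for the longest gap between consecutive log timestamps. The sketch below is only an illustration: syslog lines carry no year, so the `year` argument is a placeholder (2006 is an arbitrary choice here), and a log that crosses New Year is not handled. The names `parse_stamp` and `largest_gap` are hypothetical.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line."""
    month, day, clock = line.split(None, 3)[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")

def largest_gap(lines, year):
    """Return (gap, line_before, line_after) for the longest silence, or None."""
    best = None
    prev_line, prev_time = None, None
    for line in lines:
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_line, prev_time = line, stamp
    return best

# The log excerpt quoted in this thread.
log = [
    "Nov 30 17:00:06 alice syslogd 1.4.1: restart.",
    "Nov 30 18:00:04 alice syslogd 1.4.1: restart.",
    "Nov 30 19:00:04 alice syslogd 1.4.1: restart.",
    "Nov 30 20:00:05 alice syslogd 1.4.1: restart.",
    "Dec 1 09:01:09 alice syslogd 1.4.1: restart.",
    "Dec 1 09:01:09 alice syslog: syslogd startup succeeded",
]
gap, before, after = largest_gap(log, year=2006)  # year is an assumption
print(gap)  # 13:01:04
```

On this excerpt the longest silence runs from the 20:00:05 entry on Nov 30 to the boot at 09:01:09 on Dec 1, so the hang happened some time after 20:00:05 and before the next hourly entry would have been written.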
That's too precise for a failure. You likely have a cron job that is shutting down your system. If this is a work system, do you have any disgruntled current or former employees? If it is a home system, someone else has likely gotten access to your system (do you have FTP, rsh, telnet or other insecure access enabled?).
It could also be a hardware problem that is triggering the issue, initiated by a cron job. For example, a flaky USB/Firewire PCI card that is used for an hourly file copy to an external drive, or something similar. When the cron job starts, the system fails.
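To follow up on the cron theory, a quick way to list jobs that fire once per hour is to filter crontab lines for a fixed minute with `*` in the other four schedule fields. This is a minimal sketch, not a full crontab parser: it ignores ranges, steps and `@hourly` shortcuts, and the function name `hourly_entries` is invented for the example.

```python
def hourly_entries(crontab_text):
    """Return crontab lines that run once per hour (fixed minute, '*' elsewhere)."""
    hits = []
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 6:
            continue  # not a schedule line (e.g. SHELL=/bin/bash)
        minute, hour, dom, month, dow = fields[:5]
        if minute.isdigit() and (hour, dom, month, dow) == ("*", "*", "*", "*"):
            hits.append(line)
    return hits

# Illustrative crontab text; the real content on the affected box may differ.
sample = """SHELL=/bin/bash
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
"""
for entry in hourly_entries(sample):
    print(entry)
```

The same function could be fed the contents of /etc/crontab or the output of `crontab -l` for each user; anything it reports, plus the scripts in any directory such an entry runs, would be worth reading before blaming the hardware.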
Alright, I added the kernel options acpi=off apm=off. I am going to see if that works. I am also going to do a fresh install of CentOS 4.4 on a new machine to see if the issue goes away.
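After rebooting, it may be worth confirming that the options actually reached the kernel, since a typo in the boot loader entry would silently leave them off. The sketch below checks a kernel command line string for the two options named above; on Linux the running kernel's command line can be read from /proc/cmdline. The helper name `missing_options` and the sample command line are assumptions made for illustration.

```python
def missing_options(cmdline, wanted=("acpi=off", "apm=off")):
    """Return the wanted kernel options that are absent from a command line."""
    present = set(cmdline.split())
    return [option for option in wanted if option not in present]

def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()

# Hypothetical command line, not taken from the poster's machine.
example = "ro root=LABEL=/ acpi=off"
print(missing_options(example))  # ['apm=off']
```

Calling `missing_options(running_cmdline())` on the machine itself should return an empty list if both options took effect.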