LinuxQuestions.org
Old 09-05-2005, 01:39 PM   #1
jordanthompson
Member
 
Registered: Oct 2004
Posts: 115

Rep: Reputation: 15
fedora server shuts down for no apparent reason


Hi there,
I am running the most recent Fedora release (kept up to date with yum). I use this machine as a print/file/web/mail/Samba server. Every once in a while it just shuts itself off. I'm not sure where to look in the logs to find the answer.

thanks in advance,
Jordan
 
Old 09-05-2005, 01:48 PM   #2
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 328
Usually, a poweroff is caused by exceeding the thermal critical limit. There may be nothing in the logs, as Linux didn't make the call - the hardware/BIOS did.

Check to make sure all fans are running (and properly oriented - blowing in the correct direction). Monitor the temperature of the system.

Another possibility is a brief power outage. Make sure that you are on a UPS. Make sure that you are not exceeding the power draw on your power supply. If your server has redundant power supplies, make sure that you are using independent power sources for each.
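
One way to act on the advice to monitor the temperature is to append a timestamped reading to a file at regular intervals, so that the last value written before a poweroff survives the reboot. A minimal sketch, assuming a kernel that exposes readings in millidegrees Celsius in a file such as /sys/class/thermal/thermal_zone0/temp (older systems may expose them elsewhere, or only through lm_sensors); the function names and the log path are hypothetical:

```shell
# Convert a millidegree reading (e.g. 48500) to whole degrees Celsius.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Append one timestamped reading from the given sensor file to the given log.
# Both paths are assumptions; adjust them for your hardware.
log_temp() {
    sensor_file=$1
    log_file=$2
    reading=$(cat "$sensor_file") || return 1
    printf '%s %sC\n' "$(date '+%b %e %H:%M:%S')" "$(millideg_to_c "$reading")" >> "$log_file"
}

# Example: sample once a minute until interrupted.
# while true; do
#     log_temp /sys/class/thermal/thermal_zone0/temp /var/log/temp.log
#     sleep 60
# done
```

After an unexpected poweroff, the last line of the log shows how warm the machine was shortly before it went down.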
 
Old 09-05-2005, 03:12 PM   #3
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,154

Rep: Reputation: 333
You might want to look at sensors and set a few alarms for high temp conditions. Then you'd at least know if that was the problem.

Personally, I think power problems are more likely.

You did look at dmesg just to see what it said, didn't you?

I like
Code:
$ dmesg | gawk '/fail/;/error/'
If you wanted to look at all your logs, here's one way:
Code:
# gawk '/fail/{print FILENAME ": " $0};/error/{print FILENAME ": " $0}' `ls -D /var/log/*`
although there are lots of monitoring tools available.

Take a look here for some.
 
Old 09-05-2005, 03:58 PM   #4
jordanthompson
Member
 
Registered: Oct 2004
Posts: 115

Original Poster
Rep: Reputation: 15
Here are the results for
gawk '/fail/{print FILENAME ": " $0};/error/{print FILENAME ": " $0}' `ls -D /var/log/*` | grep "Sep 5"

I'm guessing it died around 2 hours ago (right after the first post in this thread.)


/var/log/boot.log: Sep 5 13:28:09 dot mdmpd: mdmpd failed
/var/log/maillog: Sep 5 00:49:15 dot imap[31858]: SQUAT failed to open index file
/var/log/maillog: Sep 5 00:49:15 dot imap[31858]: SQUAT failed
/var/log/maillog: Sep 5 00:49:45 dot imap[31928]: SQUAT failed to open index file
/var/log/maillog: Sep 5 00:49:45 dot imap[31928]: SQUAT failed
/var/log/maillog: Sep 5 00:50:08 dot imap[31858]: SQUAT failed to open index file
/var/log/maillog: Sep 5 00:50:08 dot imap[31858]: SQUAT failed
/var/log/maillog: Sep 5 09:10:24 dot imap[32247]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:10:24 dot imap[32247]: SQUAT failed
/var/log/maillog: Sep 5 09:10:54 dot imap[32249]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:10:54 dot imap[32249]: SQUAT failed
/var/log/maillog: Sep 5 09:11:34 dot imap[32248]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:11:34 dot imap[32248]: SQUAT failed
/var/log/maillog: Sep 5 09:15:26 dot imap[1677]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:15:26 dot imap[1677]: SQUAT failed
/var/log/maillog: Sep 5 09:36:06 dot imap[1784]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:36:06 dot imap[1784]: SQUAT failed
/var/log/maillog: Sep 5 09:37:44 dot imap[1790]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:37:44 dot imap[1790]: SQUAT failed
/var/log/maillog: Sep 5 09:38:02 dot imap[1786]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:02 dot imap[1786]: SQUAT failed
/var/log/maillog: Sep 5 09:38:10 dot imap[1787]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:10 dot imap[1787]: SQUAT failed
/var/log/maillog: Sep 5 09:38:14 dot imap[1790]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:14 dot imap[1790]: SQUAT failed
/var/log/maillog: Sep 5 09:38:49 dot imap[1786]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:49 dot imap[1786]: SQUAT failed
/var/log/maillog: Sep 5 09:38:55 dot imap[1788]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:55 dot imap[1788]: SQUAT failed
/var/log/maillog: Sep 5 09:38:59 dot imap[1791]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:38:59 dot imap[1791]: SQUAT failed
/var/log/maillog: Sep 5 09:39:34 dot imap[1789]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:39:34 dot imap[1789]: SQUAT failed
/var/log/maillog: Sep 5 09:50:34 dot imap[1794]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:50:34 dot imap[1794]: SQUAT failed
/var/log/maillog: Sep 5 09:50:39 dot imap[1795]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:50:39 dot imap[1795]: SQUAT failed
/var/log/maillog: Sep 5 09:53:49 dot imap[1828]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:53:49 dot imap[1828]: SQUAT failed
/var/log/maillog: Sep 5 09:54:09 dot imap[1833]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:54:09 dot imap[1833]: SQUAT failed
/var/log/maillog: Sep 5 09:55:35 dot imap[1838]: SQUAT failed to open index file
/var/log/maillog: Sep 5 09:55:35 dot imap[1838]: SQUAT failed
/var/log/maillog: Sep 5 10:03:29 dot imap[1841]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:03:29 dot imap[1841]: SQUAT failed
/var/log/maillog: Sep 5 10:03:34 dot imap[1843]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:03:34 dot imap[1843]: SQUAT failed
/var/log/maillog: Sep 5 10:04:03 dot imap[1881]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:04:03 dot imap[1881]: SQUAT failed
/var/log/maillog: Sep 5 10:04:18 dot imap[1841]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:04:18 dot imap[1841]: SQUAT failed
/var/log/maillog: Sep 5 10:04:45 dot imap[1842]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:04:45 dot imap[1842]: SQUAT failed
/var/log/maillog: Sep 5 10:16:30 dot imap[1883]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:16:30 dot imap[1883]: SQUAT failed
/var/log/maillog: Sep 5 10:19:48 dot imap[1885]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:19:48 dot imap[1885]: SQUAT failed
/var/log/maillog: Sep 5 10:20:22 dot imap[1886]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:20:22 dot imap[1886]: SQUAT failed
/var/log/maillog: Sep 5 10:23:44 dot imap[1888]: SQUAT failed to open index file
/var/log/maillog: Sep 5 10:23:44 dot imap[1888]: SQUAT failed
/var/log/maillog: Sep 5 15:44:17 dot imap[4189]: SQUAT failed to open index file
/var/log/maillog: Sep 5 15:44:17 dot imap[4189]: SQUAT failed
/var/log/maillog: Sep 5 15:44:28 dot imap[4256]: SQUAT failed to open index file
/var/log/maillog: Sep 5 15:44:28 dot imap[4256]: SQUAT failed
/var/log/maillog: Sep 5 15:45:36 dot imap[5434]: SQUAT failed to open index file
/var/log/maillog: Sep 5 15:45:36 dot imap[5434]: SQUAT failed
/var/log/messages: Sep 5 09:28:30 dot smbd[1779]: getpeername failed. Error was Transport endpoint is not connected
/var/log/messages: Sep 5 09:28:30 dot smbd[1779]: write_socket_data: write failure. Error = Connection reset by peer
/var/log/messages: Sep 5 13:27:38 dot kernel: ** driver failed to call pci_enable_device(). As a temporary
/var/log/messages: Sep 5 13:28:09 dot mdmpd: mdmpd failed
/var/log/messages: Sep 5 13:33:03 dot smbd[5234]: getpeername failed. Error was Transport endpoint is not connected
/var/log/messages: Sep 5 13:33:03 dot smbd[5234]: getpeername failed. Error was Transport endpoint is not connected
/var/log/messages: Sep 5 13:33:03 dot smbd[5234]: write_socket_data: write failure. Error = Connection reset by peer
/var/log/secure: Sep 5 13:27:48 dot sshd[4023]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
/var/log/secure: Sep 5 13:27:48 dot sshd[4023]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
[root@dot jordan]#
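
With this much output, a per-program tally makes it easier to separate the repeated entries from the rare ones. A sketch that assumes the line layout shown above (file name, date, time, host, then program name); `count_by_program` is a hypothetical helper, not part of any package:

```shell
# Count matches per program in lines of the form
#   /var/log/FILE: Mon D HH:MM:SS HOST PROGRAM[PID]: message
count_by_program() {
    awk '{
        prog = $6
        sub(/\[[0-9]+\]/, "", prog)   # drop the [PID] part
        sub(/:$/, "", prog)           # drop the trailing colon
        count[prog]++
    }
    END { for (p in count) print count[p], p }' | sort -rn
}

# Example: pipe the filtered log lines into the tally.
# gawk '/fail/{print FILENAME ": " $0}' /var/log/messages | grep "Sep 5" | count_by_program
```

For the output above, the imap SQUAT lines would dominate the tally, which leaves the handful of kernel, mdmpd, smbd and sshd entries as the ones worth reading individually.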
 
Old 09-05-2005, 04:19 PM   #5
jordanthompson
Member
 
Registered: Oct 2004
Posts: 115

Original Poster
Rep: Reputation: 15
By the way, I don't think it's a heat issue - I had just checked the operation of all of the fans (I do that periodically) and I have an extra one on the chassis itself to boot.
 
Old 09-08-2005, 09:20 PM   #6
jordanthompson
Member
 
Registered: Oct 2004
Posts: 115

Original Poster
Rep: Reputation: 15
Any suggestions where to look?
 
Old 09-09-2005, 12:06 AM   #7
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 328
Well, have you set up the lmsensors as PTrenholme suggested, to check the temperature readings before a shutdown? Have you verified that you are not overdrawing your power supply? Do you have a UPS? Also, are you overclocking? Are you using an aftermarket heatsink on the CPU?

Try this; open a command window and enter:

while true; do true; done

That's an infinite loop (you can interrupt it with Ctrl-c). It will cause your CPU temperature to go up quickly. If your system shuts down within about 5 minutes of starting that, it's a heat issue. If not, it's likely a power issue.
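
On a multi-core or Hyper-Threading CPU, a single loop only keeps one logical processor busy. A sketch of the same test that starts one loop per processor and stops them all after a fixed time; it assumes `nproc` from GNU coreutils is available (falling back to one loop if not), and `burn_cpu` is a hypothetical helper name:

```shell
# Run one busy loop per logical CPU for the given number of seconds.
burn_cpu() {
    seconds=$1
    workers=$(nproc 2>/dev/null || echo 1)   # fall back to a single worker
    pids=""
    i=0
    while [ "$i" -lt "$workers" ]; do
        # Same infinite loop as above, run in the background.
        ( while true; do true; done ) >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$seconds"
    kill $pids 2>/dev/null || true           # stop every worker
    wait $pids 2>/dev/null || true
    echo "ran $workers busy loops for $seconds seconds"
}

# Example: five minutes of load.
# burn_cpu 300
```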
 
Old 09-09-2005, 11:07 PM   #8
jordanthompson
Member
 
Registered: Oct 2004
Posts: 115

Original Poster
Rep: Reputation: 15
Well, have you set up the lmsensors as PTrenholme suggested, to check the temperature readings before a shutdown?
I could not get this to work (compile, install, etc.). I was able to install the RPM, but I can't find where it put the binaries.

Have you verified that you are not overdrawing your power supply?
I am definitely not overdrawing the power supply. I have one card - everything else is on the motherboard.

Do you have a UPS?
Actually, I have two - I live in Florida :-)

Also, are you overclocking?
No

Are you using an aftermarket heatsink on the CPU?
No - it is an Intel - at least it came with the CPU.

Try this; open a command window and enter:
while true; do true; done
Did this - it has been running now for over an hour - the computer is still up.

Any other suggestions? Where in the logs could I find a clue - if the OS is shutting it down for some reason?
Thanks very much for your help,
Jordan
 
Old 09-10-2005, 04:22 AM   #9
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 328
If the OS is shutting it down (rather than the machine spontaneously powering off), open /var/log/messages after booting back up. Go to the bottom of the file, then search backwards for "restart", which finds the syslogd restart message written at boot. The messages just before it are the last ones recorded by the system before the poweroff. If this was a normal, system-initiated shutdown, you will see a series of service termination messages. For example:

Code:
Aug  8 01:22:41 mmouse shutdown: shutting down for system reboot
Aug  8 01:22:43 mmouse init: Switching to runlevel: 6
Aug  8 01:22:44 mmouse cups-config-daemon: cups-config-daemon -TERM succeeded
Aug  8 01:22:45 mmouse dbus: avc:  1 AV entries and 1/512 buckets used, longest chain length 1
Aug  8 01:22:45 mmouse messagebus: messagebus -TERM succeeded
Aug  8 01:22:45 mmouse cups: cupsd shutdown succeeded
Aug  8 01:22:50 mmouse httpd: httpd shutdown succeeded
Aug  8 01:22:50 mmouse sshd: sshd -TERM succeeded
Aug  8 01:22:51 mmouse sendmail: sendmail shutdown succeeded
Aug  8 01:22:51 mmouse sendmail: sm-client shutdown succeeded
Aug  8 01:22:51 mmouse spamassassin: spamd shutdown succeeded
Aug  8 01:22:52 mmouse dhcpd: dhcpd shutdown succeeded
Aug  8 01:22:52 mmouse dhcpd: dhcpd shutdown succeeded
Aug  8 01:22:52 mmouse smartd[3966]: smartd received signal 15: Terminated
Aug  8 01:22:52 mmouse smartd[3966]: smartd is exiting (exit status 0)
Aug  8 01:22:53 mmouse smartd: smartd shutdown succeeded
Aug  8 01:22:53 mmouse xinetd[4084]: Exiting...
Aug  8 01:22:53 mmouse xinetd: xinetd shutdown succeeded
Aug  8 01:22:54 mmouse acpid: acpid shutdown succeeded
Aug  8 01:22:55 mmouse crond: crond shutdown succeeded
Aug  8 01:22:55 mmouse ntpd[4104]: ntpd exiting on signal 15
Aug  8 01:22:55 mmouse mdmonitor: mdadm shutdown succeeded
Aug  8 01:22:55 mmouse kernel: Kernel logging (proc) stopped.
Aug  8 01:22:55 mmouse kernel: Kernel log daemon terminating.
Aug  8 01:22:56 mmouse syslog: klogd shutdown succeeded
Aug  8 01:22:56 mmouse exiting on signal 15
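
The backwards search can also be scripted. A sketch, assuming the syslogd start-up line contains both "syslogd" and "restart" as described above; `before_restart` is a hypothetical helper name and the default of 20 lines of context is arbitrary:

```shell
# Print the N lines logged immediately before each syslogd "restart" line.
# Usage: before_restart LOGFILE [N]
before_restart() {
    awk -v n="${2:-20}" '
        /syslogd/ && /restart/ {
            print "--- before: " $0
            # Replay the ring buffer, oldest line first.
            for (i = NR - n; i < NR; i++)
                if (i > 0) print buf[i % n]
        }
        { buf[NR % n] = $0 }   # remember the last n lines
    ' "$1"
}

# Example (as root):
# before_restart /var/log/messages 30
```

If the lines printed are ordinary activity rather than a run of "shutdown succeeded" messages like the ones above, the OS did not initiate the poweroff.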
 
Old 09-10-2005, 02:54 PM   #10
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,154

Rep: Reputation: 333
I really don't think you need it, since over-temperature problems are not too likely and your BIOS should be monitoring the temperatures for you. But if you'd like to see the temperatures, the lm_sensors package is, I believe, in the FC4 "base" repository, so just do a
Code:
# yum install lm_sensors
(The version on ATrpms is more current, if you care.)

If you use KDE, try a
Code:
# yum install kdeutils
instead (which should install lm_sensors as a dependency), and then use the "KSensors" applet to display the temperatures on your panel and set alarms. (If you don't see any hardware temperature sensors, look at info sensors and info sensors.conf.)

By the way, my system uses a 3GHz Intel 745 H/T, which was running hot until I replaced the stock cooler with an aftermarket one. Now I'm seldom above 110 F.

Caution: Replacing the heat-sink is not easy for the inexperienced, and experience can be expensive to acquire. It cost me $200 for a new CPU when I bent the pins removing the old CPU/heat-sink. (I thought I could remove the heat-sink leaving the CPU in the M/B, but the thermal grease had a different idea, and prevailed.)
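
On a server without a desktop, the alarm idea can be approximated from the command line by filtering the text that the `sensors` command prints. A sketch only: it assumes reading lines of the form "Label:   +48.0°C  (high = ...)", which varies with the lm_sensors version and the sensor chip, and `hot_readings` is a hypothetical helper name:

```shell
# Read sensors-style text on stdin and print every temperature above LIMIT (Celsius).
# Usage: sensors | hot_readings LIMIT
hot_readings() {
    awk -v limit="$1" -F: '
        NF >= 2 && match($2, /^[ \t]*\+[0-9]+\.[0-9]+/) {
            value = substr($2, RSTART, RLENGTH)
            gsub(/[ \t+]/, "", value)
            # Keep only readings followed by a Celsius unit (skips voltages).
            rest = substr($2, RSTART + RLENGTH)
            if (rest ~ /^.?.?C/ && value + 0 > limit + 0)
                print $1 ": " value
        }'
}

# Example (after running sensors-detect once as root):
# sensors | hot_readings 60
```

If the RPM installed but the binaries seem to be missing, `rpm -ql lm_sensors | grep bin` lists where the package put them.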
 
  


All times are GMT -5.
