Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366
Gutsy reboots every hour!
Hi--
At 13 minutes past the hour, every hour, my Ubuntu 7.10 gutsy box reboots.
This morning, we had a power outage, a couple of them within about an hour of each other. According to syslog, the first of these caused a reboot at 8:13 am. Logs on my server from the smart-ups on that box show "line voltage notch or spike" at 9:28, 8:07, and 8:06 am.
This rebooting started at 1:13 pm, after I had been working on it since about 10:20 am.
Here is my crontab--I see nothing there that would cause this strange behavior.
Code:
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
# m h dom mon dow user command
17 * * * * root cd / && run-parts --report /etc/cron.hourly
25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
#
####copied from sdb1 (old drive) and refers there 20071208: did not work so commented out for now; changed to new locations 20071210:
0 * * * * root /usr/sbin/esets_update
##############ddg 20061113 updated for new directories 20071210:
0 3 * * * root /usr/sbin/esets_scan -l --mail --unsafe / -- -/dev* -/proc* -/sam* -/media/sdb1/dev* -/media/sdb1/proc* -/media/sdb1/sam*
##############
30 * * * * root cp -pru ~doug/.evolution /sam/vol22/comm/evo/
This machine is used in a production environment, so this is something I need to fix quickly.
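One way to double-check the crontab by machine rather than by eye is to test each entry's minute field against minute 13. The following is only a rough sketch: the helper names `minute_matches` and `jobs_at_minute` are made up for illustration, only the minute field is examined (hour, day and month are ignored), and it knows nothing about /etc/cron.d, per-user crontabs, or the scripts inside /etc/cron.hourly.

```python
def minute_matches(field, minute):
    """Return True if a crontab minute field (e.g. '17', '*/15', '10-20') includes `minute`."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            low, high = 0, 59
        elif "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if low <= minute <= high and (minute - low) % step == 0:
            return True
    return False


def jobs_at_minute(crontab_text, minute):
    """Return the commands of system-crontab entries whose minute field includes `minute`."""
    hits = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # System crontab layout: m h dom mon dow user command
        fields = line.split(None, 6)
        if len(fields) < 7 or not (fields[0][0].isdigit() or fields[0][0] == "*"):
            continue  # skips SHELL=..., PATH=... and anything malformed
        if minute_matches(fields[0], minute):
            hits.append(fields[6])
    return hits


# Three of the hourly entries from the crontab posted above.
CRONTAB = """\
17 * * * * root cd / && run-parts --report /etc/cron.hourly
0 * * * * root /usr/sbin/esets_update
30 * * * * root cp -pru ~doug/.evolution /sam/vol22/comm/evo/
"""

print(jobs_at_minute(CRONTAB, 13))  # prints []
```

An empty list for minute 13 agrees with the reading that nothing in this file is scheduled at the time of the reboots.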
You want to look at the end of the previous boot's syslog, and look for a crashlog, to see if there is a shutdown command. Another approach is to watch it as it does this - preferably from a terminal.
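Looking at the end of the previous boot's log can be sketched roughly like this: find each syslogd "restart." line and keep the few lines logged just before it. The function name `context_before_restarts` is hypothetical, and the non-restart sample lines are invented placeholders, not real log output. Bear in mind that a syslogd restart line alone may not prove a reboot, since the daemon can also be restarted without one.

```python
from collections import deque


def context_before_restarts(lines, marker="syslogd", context=5):
    """Return (restart_line, preceding_lines) for every syslogd restart line."""
    recent = deque(maxlen=context)  # rolling window of the last `context` lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if marker in line and line.endswith("restart."):
            results.append((line, list(recent)))
        recent.append(line)
    return results


# The two restart lines come from this thread; the "example:" lines are
# made-up placeholders standing in for whatever was logged before the reboot.
SAMPLE = [
    "Jan 5 13:12:40 doug2 example: hypothetical message A",
    "Jan 5 13:12:58 doug2 example: hypothetical message B",
    "Jan 5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.",
    "Jan 5 14:13:01 doug2 example: hypothetical message C",
    "Jan 5 14:13:54 doug2 syslogd 1.4.1#21ubuntu3: restart.",
]

for restart, before in context_before_restarts(SAMPLE, context=2):
    print(restart)
    for earlier in before:
        print("    before:", earlier)
```

On a real system the same function could be fed the lines of the syslog file instead of `SAMPLE`. A shutdown or reboot message in the "before" lines would point to an orderly, software-initiated restart rather than a power cut.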
It is possible that physical damage from the spike is triggering the reboot from the hardware side, in which case this is not a Linux issue at all.
Last edited by Simon Bridge; 01-06-2008 at 08:51 AM.
Original Poster
Simon--
Many thanks for your quick reply.
There has been more strangeness, though perhaps of the good kind.
After posting here, I shut down this box, unplugged the power and the ethernet as well as the mouse and keyboard, replugged all, restarted, and there have been no more involuntary reboots.
I did check the server and there were no other power spikes or problems reported. The whole building has a surge arrestor on the power system and it is still functioning.
Strangeness # 2. Another computer on my network lost power during this power spike and then we were unable to boot it. It stopped at or just after the Intel bootup screen (prior to grub) and reported an "error 106." We unplugged all, took it to the tech people, and they could not repeat the problem--it booted right up for them. (Of course!) So we brought it back here and it worked fine for us too. That's what gave me the idea to unplug this system.
So strangeness on strangeness.
Does this tell us there is a problem that needs more looking at? Or should I just let it go, now that all seems OK?
In case it is still relevant:
Here is the /var/crash directory--these two crashes were 3 days before this problem appeared:
Quote:
After posting here, I shut down this box, unplugged the power and the ethernet as well as the mouse and keyboard, replugged all, restarted, and there have been no more involuntary reboots.
Well, there you are. Clearly the system was left in an odd state after the spike - clearing the RAM and registers has fixed it. You are lucky: you might have needed to clear the NVRAM too. Sometimes a power spike can damage onboard components like capacitors and resistors... once one of these goes out of tolerance, it can introduce all kinds of odd artifacts into the datastream. Accumulated small errors would cause a crash too, and that is almost impossible to diagnose.
Quote:
It stopped at or just after the Intel bootup screen (prior to grub) and reported an "error 106." We unplugged all, took it to the tech people, and they could not repeat the problem--it booted right up for them.
Stopped at BIOS... looks like a register storing an odd value then.
Quote:
Jan 5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.
Jan 5 14:13:54 doug2 syslogd 1.4.1#21ubuntu3: restart.
Times are not exactly the same - otherwise it doesn't really tell us much.
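To put a number on "not exactly the same", the gap between the two restart lines can be worked out from their timestamps. This is a small sketch: the function name `restart_intervals` is made up, and the year is an assumption because syslog lines of this kind do not carry one (2008 is taken from the dates of this thread).

```python
from datetime import datetime


def restart_intervals(lines, year=2008):
    """Return the time differences between consecutive syslog lines."""
    stamps = []
    for line in lines:
        month, day, clock = line.split()[:3]
        # The year is assumed; the log line itself only has month, day and time.
        stamps.append(
            datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
        )
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


RESTARTS = [
    "Jan 5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.",
    "Jan 5 14:13:54 doug2 syslogd 1.4.1#21ubuntu3: restart.",
]

for gap in restart_intervals(RESTARTS):
    print(gap)  # prints 1:00:17
```

So the second restart came one hour and 17 seconds after the first.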
Without the powerdown, I'd have suggested running without that esets_update script. It's unlikely to have directly contributed, but it may have used a bad register or initiated a buffer run which accumulated enough "bad stuff" in about 13 minutes to require a restart.
The restart itself seems quite orderly.
Hopefully all this has convinced you to install surge protection?
(You got away with it this time, next time it could be smoke and flames!)
Original Poster
Guru Simon--
"smoke and flames!"
Ouch!
And thanks for the info on NVRAM--never knew there was such a thing. That sounds like a likely culprit here.
"Stopped at BIOS... looks like a register storing an odd value then." So pulling the power cleared it, yes? I had guessed it might be a bad power supply....
Oh dear ... a resonant loop in the switching PSU... I guess it's possible, but these things are pretty simple: they either go or they don't. In your case the kernel received a "restart", which would not happen if the power had simply cut out. Yank the power cord and see.
Note: software does so much these days that we seldom see the hardware effects. However, witness the insight this gives.
I haven't seen the flames so far, but I have certainly smelled the smoke!
It was one of the computers at my job. We spent half an hour looking for the fire, inside the office as well as outside, and finally noticed that one computer was off although nobody remembered shutting it down!
It seems that some voltage problem (or perhaps a PSU problem?) killed everything in the PC: motherboard, cards, drives.
Despite this, the workstations still run without any power protection. Only the server is on a UPS.
No worries - these things can be tricky to troubleshoot. Sometimes just the act of discussing a problem can put your mind in a receptive state, so you notice possibilities that might not occur to you otherwise. This holds even when the person you're talking to doesn't actually suggest anything helpful.
Doing this in public helps everybody.
Happy hacking