LinuxQuestions.org
Old 01-05-2008, 03:53 PM   #1
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Rep: Reputation: 31
Question Gutsy reboots every hour!


Hi--

At 13 minutes past the hour, every hour, my Ubuntu 7.10 gutsy box reboots.

This morning, we had a power outage, a couple of them within about an hour of each other. According to syslog, the first of these caused a reboot at 8:13 am. Logs on my server from the smart-ups on that box show "line voltage notch or spike" at 9:28, 8:07, and 8:06 am.

This rebooting started at 1:13 pm, after I had been working on it since about 10:20 am.

Here is my crontab--I see nothing there that would cause this strange behavior.
Code:
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user	command
17 *	* * *	root    cd / && run-parts --report /etc/cron.hourly
25 6	* * *	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6	* * 7	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6	1 * *	root	test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
#

####copied from sdb1 (old drive) and refers there 20071208: did not work so commented out for now; changed to new locations 20071210:

0 * * * * root /usr/sbin/esets_update

##############ddg 20061113 updated for new directories 20071210:

0 3 * * * root /usr/sbin/esets_scan -l --mail --unsafe / -- -/dev* -/proc* -/sam* -/media/sdb1/dev* -/media/sdb1/proc* -/media/sdb1/sam*

##############

30 * * * *      root    cp -pru ~doug/.evolution /sam/vol22/comm/evo/
This machine is used in a production environment, so this is something I need to fix quickly.

Any ideas how to troubleshoot this, please?

Thanks!
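
One way to double-check the claim that nothing in the crontab fires at 13 minutes past the hour is to test the minute field of each entry mechanically. The following is only a rough sketch: the helper names `minute_matches` and `jobs_at_minute` are made up for illustration, only the minute field is examined (hour, day and month are ignored), and it assumes the seven-field system crontab layout shown above. Files in /etc/cron.d, per-user crontabs and the scripts under /etc/cron.hourly would need the same check.

```python
def minute_matches(field, minute):
    """Return True if a crontab minute field (e.g. '17', '*/15', '0,30', '10-20') covers `minute`."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            low, high = 0, 59
        elif "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if low <= minute <= high and (minute - low) % step == 0:
            return True
    return False


def jobs_at_minute(crontab_text, minute):
    """List the commands of system-crontab entries whose minute field covers `minute`."""
    hits = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # System crontab: minute hour dom month dow user command
        fields = line.split(None, 6)
        if len(fields) < 7 or "=" in fields[0]:
            continue  # environment assignment (SHELL=..., PATH=...) or malformed line
        if minute_matches(fields[0], minute):
            hits.append(fields[6])
    return hits


# A few entries copied from the crontab above.
SAMPLE = """\
SHELL=/bin/sh
17 *\t* * *\troot    cd / && run-parts --report /etc/cron.hourly
0 * * * * root /usr/sbin/esets_update
30 * * * *      root    cp -pru ~doug/.evolution /sam/vol22/comm/evo/
"""

print(jobs_at_minute(SAMPLE, 13))  # prints [] : none of these entries runs at minute 13
```

To run it against the real file, something like `jobs_at_minute(open("/etc/crontab").read(), 13)` should work, assuming the file is readable by the current user.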
 
Old 01-06-2008, 08:49 AM   #2
Simon Bridge
LQ Guru
 
Registered: Oct 2003
Location: Waiheke NZ
Distribution: Ubuntu
Posts: 9,211

Rep: Reputation: 198Reputation: 198
You want to look at the end of the previous boot's syslog, and look for a crashlog, to see if there is a shutdown command. Another approach is to watch it as it does this - preferably from a terminal.

It is possible that physical damage to the system from the spike is setting up something that causes the reboot from the HW end and this is not a linux issue at all.
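
A rough sketch of the first suggestion, for anyone who would rather script it than scroll: find each syslogd "restart." line and look at the few lines logged just before it. The function names here are made up for illustration, and the list of "clean shutdown" markers is an assumption (sysklogd normally logs "exiting on signal 15" when it is stopped in an orderly way, but other setups may log something different). The absence of a marker is a hint, not proof.

```python
def restart_contexts(lines, before=3):
    """Return (restart_line, preceding_lines) for every syslogd restart entry."""
    contexts = []
    for index, line in enumerate(lines):
        if "syslogd" in line and line.rstrip().endswith("restart."):
            start = max(0, index - before)
            contexts.append((line, lines[start:index]))
    return contexts


def looks_orderly(preceding):
    """Guess whether the lines before a restart show a clean shutdown.

    Assumption: a clean shutdown leaves one of these substrings in the log.
    """
    markers = ("exiting on signal", "shutdown")
    return any(marker in line.lower() for line in preceding for marker in markers)


# Lines in the same format as a sysklogd /var/log/syslog on this release.
sample = [
    "Jan  5 12:53:19 doug2 -- MARK --",
    "Jan  5 13:00:01 doug2 /USR/SBIN/CRON[6651]: (root) CMD (/usr/sbin/esets_update)",
    "Jan  5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.",
]

for restart_line, preceding in restart_contexts(sample):
    print(restart_line)
    print("  clean shutdown marker found:", looks_orderly(preceding))
```

On a real system the input would be something like `open("/var/log/syslog").read().splitlines()`, plus the rotated copies if the reboot is older.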

Last edited by Simon Bridge; 01-06-2008 at 08:51 AM.
 
Old 01-06-2008, 08:25 PM   #3
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Original Poster
Rep: Reputation: 31
Exclamation

Simon--

Many thanks for your quick reply.

There has been more strangeness, though perhaps it is the good kind.

After posting here, I shut down this box, unplugged the power and the ethernet as well as the mouse and keyboard, replugged all, restarted, and there have been no more involuntary reboots.

I did check the server and there were no other power spikes or problems reported. The whole building has a surge arrestor on the power system and it is still functioning.

Strangeness # 2. Another computer on my network lost power during this power spike and then we were unable to boot it. It stopped at or just after the Intel bootup screen (prior to grub) and reported an "error 106." We unplugged all, took it to the tech people, and they could not repeat the problem--it booted right up for them. (Of course!) So we brought it back here and it worked fine for us too. That's what gave me the idea to unplug this system.

So strangeness on strangeness.

Does this tell us there is a problem that needs looking at more? Or just let it go for now, now that all seems OK?

In case it is still relevant:

Here is the /var/crash directory--these two crashes were 3 days before this problem appeared:

Code:
drwxrwxrwt  2 root root 4.0K 2008-01-06 07:35 .
drwxr-xr-x 15 root root 4.0K 2007-12-05 21:45 ..
-rw-------  1 doug doug 1.9M 2008-01-06 21:00 _usr_bin_serpentine.1000.crash
-rw-------  1 doug doug 4.2M 2008-01-02 13:03 _usr_lib_xscreensaver_cyclone.1000.crash
Here is the first problem time in syslog (I see no crash log--maybe I am looking in the wrong place):

Code:
Jan  5 08:00:01 doug2 /USR/SBIN/CRON[27698]: (root) CMD (/usr/sbin/esets_update)
Jan  5 08:13:20 doug2 syslogd 1.4.1#21ubuntu3: restart.

Here's the time around the first reboot from syslog:

Code:
Jan  5 12:00:01 doug2 /USR/SBIN/CRON[6482]: (root) CMD (/usr/sbin/esets_update)
Jan  5 12:13:19 doug2 -- MARK --
Jan  5 12:17:01 doug2 /USR/SBIN/CRON[6521]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan  5 12:30:01 doug2 /USR/SBIN/CRON[6557]: (root) CMD (   cp -pru ~doug/.evolution /sam/vol22/comm/evo/)
Jan  5 12:53:19 doug2 -- MARK --
Jan  5 13:00:01 doug2 /USR/SBIN/CRON[6651]: (root) CMD (/usr/sbin/esets_update)
Jan  5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.
Jan  5 13:13:37 doug2 kernel: Inspecting /boot/System.map-2.6.22-14-generic
Jan  5 13:13:37 doug2 kernel: Loaded 25445 symbols from /boot/System.map-2.6.22-14-generic.
Jan  5 13:13:37 doug2 kernel: Symbols match kernel version 2.6.22.
Jan  5 13:13:37 doug2 kernel: No module symbols loaded - kernel modules not enabled. 
Jan  5 13:13:37 doug2 kernel: [    0.000000] Linux version 2.6.22-14-generic (buildd@terranova) (gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #1 SMP Tue Dec 18 08:02:57 UTC 2007 (Ubuntu 2.6.22-14.47-generic)
Jan  5 13:13:37 doug2 kernel: [    0.000000] BIOS-provided physical RAM map:
Jan  5 13:13:37 doug2 kernel: [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Jan  5 13:13:37 doug2 kernel: [    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
Jan  5 13:13:37 doug2 kernel: [    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Jan  5 13:13:37 doug2 kernel: [    0.000000]  BIOS-e820: 0000000000100000 - 000000007ed11000 (usable)
And here from the second reboot:
Code:
Jan  5 13:15:17 doug2 kernel: [  118.012000]  CIFS VFS: Send error in read = -13
Jan  5 13:15:17 doug2 kernel: [  118.012000]  CIFS VFS: Send error in read = -13
Jan  5 13:17:01 doug2 /USR/SBIN/CRON[6114]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan  5 13:30:01 doug2 /USR/SBIN/CRON[6170]: (root) CMD (   cp -pru ~doug/.evolution /sam/vol22/comm/evo/)
Jan  5 13:53:37 doug2 -- MARK --
Jan  5 14:00:01 doug2 /USR/SBIN/CRON[6229]: (root) CMD (/usr/sbin/esets_update)
Jan  5 14:13:54 doug2 syslogd 1.4.1#21ubuntu3: restart.
Jan  5 14:13:54 doug2 kernel: Inspecting /boot/System.map-2.6.22-14-generic
Jan  5 14:13:54 doug2 kernel: Loaded 25445 symbols from /boot/System.map-2.6.22-14-generic.
Jan  5 14:13:54 doug2 kernel: Symbols match kernel version 2.6.22.
Jan  5 14:13:54 doug2 kernel: No module symbols loaded - kernel modules not enabled. 
Jan  5 14:13:54 doug2 kernel: [    0.000000] Linux version 2.6.22-14-generic (buildd@terranova) (gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #1 SMP Tue Dec 18 08:02:57 UTC 2007 (Ubuntu 2.6.22-14.47-generic)
Jan  5 14:13:54 doug2 kernel: [    0.000000] BIOS-provided physical RAM map:
Jan  5 14:13:54 doug2 kernel: [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Thanks, Simon!
 
Old 01-06-2008, 09:07 PM   #4
Simon Bridge
LQ Guru
 
Registered: Oct 2003
Location: Waiheke NZ
Distribution: Ubuntu
Posts: 9,211

Rep: Reputation: 198Reputation: 198
Quote:
After posting here, I shut down this box, unplugged the power and the ethernet as well as the mouse and keyboard, replugged all, restarted, and there have been no more involuntary reboots.
Well, there you are. Clearly the system was left in an odd state after the spike - clearing the RAM and registers has fixed it. You are lucky: you might have needed to clear the NVRAM too. Sometimes a power spike can damage onboard components like capacitors and resistors... once one of these goes out of tolerance, it can introduce all kinds of odd artifacts into the datastream. Accumulated small errors could cause a crash too, and that is almost impossible to diagnose.
Quote:
It stopped at or just after the Intel bootup screen (prior to grub) and reported an "error 106." We unplugged all, took it to the tech people, and they could not repeat the problem--it booted right up for them.
Stopped at BIOS... looks like a register storing an odd value then.

Quote:
Jan 5 13:13:37 doug2 syslogd 1.4.1#21ubuntu3: restart.
Jan 5 14:13:54 doug2 syslogd 1.4.1#21ubuntu3: restart.
Times are not exactly the same - otherwise it doesn't really tell us much.
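
To put a number on "not exactly the same", here is a small sketch that computes the gaps between the syslogd restart timestamps quoted in this thread. The function name `restart_intervals` is made up for illustration, and the year is an assumption supplied by hand, since syslog timestamps of this style do not carry one.

```python
from datetime import datetime


def restart_intervals(stamps, year=2008):
    """Return the gaps, in seconds, between consecutive syslog timestamps."""
    times = []
    for stamp in stamps:
        # Collapse the double space syslog uses for single-digit days ("Jan  5").
        normalized = " ".join(stamp.split())
        times.append(datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S"))
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


# Restart times taken from the syslog excerpts posted above.
stamps = ["Jan  5 08:13:20", "Jan  5 13:13:37", "Jan  5 14:13:54"]
print(restart_intervals(stamps))  # prints [18017.0, 3617.0]
```

Both gaps are a whole number of hours plus 17 seconds. That is only an observation about these three timestamps, not a diagnosis.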

Without the powerdown, I'd have suggested running without that esets_update script. It's unlikely to have directly contributed but it may have used a bad register or initiated a buffer run which accumulated enough "bad stuff" in about 13mins to require a restart.

The restart itself seems quite orderly.

Hopefully this-all has convinced you to install surge protection?
(You got away with it this time, next time it could be smoke and flames!)
 
Old 01-06-2008, 09:17 PM   #5
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Original Poster
Rep: Reputation: 31
Thumbs up

Guru Simon--

"smoke and flames!"

Ouch!

And thanks for the info on NVRAM--never knew there was such a thing. That's what it sounds like is a likely culprit here.

"Stopped at BIOS... looks like a register storing an odd value then." So pulling the power cleared it, yes? I had guessed it might be a bad power supply....

Thanks, Simon!
 
Old 01-06-2008, 09:31 PM   #6
Simon Bridge
LQ Guru
 
Registered: Oct 2003
Location: Waiheke NZ
Distribution: Ubuntu
Posts: 9,211

Rep: Reputation: 198Reputation: 198
Oh dear ... a resonant loop in the switching PSU... I guess it's possible, but these things are pretty simple: they either go or they don't. In your case, the kernel received a "restart", which wouldn't happen if the power had simply cut out. Yank the power cord and see.

Note: software does so much these days that we seldom see the hardware effects. However, witness the insight this gives.
 
Old 01-07-2008, 08:56 AM   #7
trickykid
LQ Guru
 
Registered: Jan 2001
Posts: 24,149

Rep: Reputation: 269Reputation: 269Reputation: 269
Oh I hate when problems are fixed by JFM.. "Just F**king Magic"

I've had this JFM as a sysadmin many times and it drives me nuts at times.
 
Old 01-07-2008, 09:56 AM   #8
masterclassic
Member
 
Registered: Jun 2007
Distribution: Knoppix, antiX
Posts: 252

Rep: Reputation: 73
I haven't seen the flames so far, but I have certainly smelled the smoke!
It was one of the computers at my job. We spent half an hour looking for the fire, in the office rooms as well as outside, and finally noticed that a computer was off although nobody remembered powering it down!!!
It seems that some voltage problem (or, perhaps, a PSU problem?) killed everything in the PC: motherboard, cards, drives.

Despite this, the workstations are still working without any power protection. Just the server works on UPS.
 
Old 01-10-2008, 08:11 PM   #9
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Original Poster
Rep: Reputation: 31
Thumbs up

Simon--

Thanks very much for all your help!

It has been running now without a shutdown for about 5 days, so it was one of those magic things that TrickyKid points out!

I want you to know that I am very thankful for your help, Simon!
 
Old 01-11-2008, 07:25 PM   #10
Simon Bridge
LQ Guru
 
Registered: Oct 2003
Location: Waiheke NZ
Distribution: Ubuntu
Posts: 9,211

Rep: Reputation: 198Reputation: 198
No worries - these things can be tricky to troubleshoot. Sometimes just the act of discussing a problem can put your mind in a receptive state, so you notice possibilities that might not occur to you otherwise. This holds even when the person you're talking to doesn't actually suggest anything helpful.

Doing this in public helps everybody.
Happy hacking
 
Old 01-12-2008, 09:42 AM   #11
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Original Poster
Rep: Reputation: 31
Cool

Simon--

What you said is quite profound. Are you often a philosopher?

I'm going to add your reply to my favorite quotes file. Thank you!
 
Old 01-12-2008, 10:40 AM   #12
Simon Bridge
LQ Guru
 
Registered: Oct 2003
Location: Waiheke NZ
Distribution: Ubuntu
Posts: 9,211

Rep: Reputation: 198Reputation: 198
You just have to find good quality alcohol.
 
Old 01-12-2008, 10:47 AM   #13
dgermann
Member
 
Registered: Aug 2004
Distribution: Ubuntu 16.04 lts desk; Ubuntu 14.04 server
Posts: 366

Original Poster
Rep: Reputation: 31
Wink

Simon--

 
  

