LinuxQuestions.org
11-04-2010, 12:15 PM   #1
someshpr
Member
 
Registered: Jul 2009
Location: CA, USA
Distribution: Debian, RHEL5.4, CentOS 5.4, 6.2, Ubuntu 11.04,12.04
Posts: 107

Rep: Reputation: 26
System rebooting on its own ... where to check for reasons?


Hi,
I have an NFS fileserver that has served me well for more than a year. Recently, though, I noticed that it has started to reboot on its own very frequently, almost once a day! It is most likely not a power-related issue: I tried changing the UPS and the power source, and it did not help!

So my question is:
Is there a log file where I can check what is causing the reboots? There may not be a single log file, but I need a starting point for the investigation!

OS: CentOS 5.2

TIA,

Last edited by someshpr; 11-04-2010 at 12:19 PM.
 
11-04-2010, 12:36 PM   #2
ckoniecny
Member
 
Registered: Oct 2005
Posts: 162

Rep: Reputation: 30
The first place I would look is the syslog. I'm not too familiar with CentOS, but the majority of Linux distributions keep their logs in /var/log. The entries there ought to show which processes were active just before the system crashed.
 
11-04-2010, 02:54 PM   #3
someshpr
Original Poster
Quote:
Originally Posted by ckoniecny
The first place I would look is the syslog. I'm not too familiar with CentOS, but the majority of Linux distributions keep their logs in /var/log. The entries there ought to show which processes were active just before the system crashed.
Thanks ckoniecny,
I checked the /var/log directory. There are lots of log files. However, the only ones last modified during the last reboot (Nov 4 at 10:26) are dmesg, acpid and apcupsd.events.
acpid has this at its end:
Code:
[Wed Nov  3 11:33:41 2010] 1 rule loaded
[Wed Nov  3 18:02:54 2010] exiting
[Wed Nov  3 18:07:48 2010] starting up
[Wed Nov  3 18:07:48 2010] 1 rule loaded
[Thu Nov  4 10:26:23 2010] starting up
[Thu Nov  4 10:26:23 2010] 1 rule loaded
apcupsd.events seems to indicate that the UPS is not connected to the server via a data cable.
dmesg has a lot of entries and I cannot say which ones are related to the reboot. I'll look into each entry to figure it out.
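
The check described above, sorting log files by whether they were last written before or after the reboot, could be scripted roughly as follows. This is only a sketch: the function name `split_logs_by_mtime` is made up for illustration, and the directory and reboot time you pass in are assumptions to replace with your own values.

```python
import os
from datetime import datetime


def split_logs_by_mtime(log_dir, boot_time):
    """Split the files in log_dir by last-modified time relative to boot_time.

    Returns (before, after): names modified before boot_time, and names
    modified at or after it. boot_time is a naive datetime in local time.
    """
    cutoff = boot_time.timestamp()
    before, after = [], []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-regular entries
        if os.path.getmtime(path) < cutoff:
            before.append(name)
        else:
            after.append(name)
    return before, after
```

For the server in this thread one might call `split_logs_by_mtime("/var/log", datetime(2010, 11, 4, 10, 26))`. Files in the first list stopped changing before the machine came back up, so their last entries may be the closest thing to a record of what happened just before the reboot.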

Logs that have been modified since the last reboot are: wtmp, btmp, secure, lastlog, cron and messages. btmp and wtmp are binary files and I couldn't read them. secure seems to contain entries for ssh connections made to the machine. cron has nothing either (no cron jobs are set). messages has this around the timestamp of the reboot:
Code:
Nov  4 10:07:02 cretaceous mountd[2486]: authenticated unmount request from archean.researchdomain.edu:620 for /export/share (/export/share)
Nov  4 10:07:02 cretaceous mountd[2486]: authenticated unmount request from archean.researchdomain.edu:623 for /export/depot (/export/depot)
Nov  4 10:16:38 cretaceous mountd[2486]: authenticated mount request from archean.researchdomain.edu:997 for /export/share (/export/share)
Nov  4 10:16:38 cretaceous kernel: svc: unknown version (3)
Nov  4 10:22:04 cretaceous mountd[2486]: authenticated unmount request from archean.researchdomain.edu:614 for /export/share (/export/share)
Nov  4 10:22:11 cretaceous mountd[2486]: authenticated mount request from cambrian.researchdomain.edu:810 for /export/home (/export/home)
Nov  4 10:22:11 cretaceous kernel: svc: unknown version (3)
Nov  4 10:26:21 cretaceous syslogd 1.4.1: restart.
Nov  4 10:26:21 cretaceous kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov  4 10:26:21 cretaceous kernel: Linux version 2.6.18-53.1.21.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)) #1 SMP Tue May 20 09:35:07 EDT 2008
Nov  4 10:26:21 cretaceous kernel: Command line: ro root=LABEL=/
Nov  4 10:26:21 cretaceous kernel: BIOS-provided physical RAM map:
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 0000000000100000 - 000000007fb50000 (usable)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 000000007fb50000 - 000000007fb66000 (reserved)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 000000007fb66000 - 000000007fb85c00 (ACPI data)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 000000007fb85c00 - 0000000080000000 (reserved)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Nov  4 10:26:21 cretaceous kernel:  BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
Nov  4 10:26:21 cretaceous kernel: DMI 2.4 present.
Nov  4 10:26:21 cretaceous kernel: No NUMA configuration found
Nov  4 10:26:21 cretaceous kernel: Faking a node at 0000000000000000-000000007fb50000
Nov  4 10:26:21 cretaceous kernel: Bootmem setup node 0 0000000000000000-000000007fb50000
Nov  4 10:26:21 cretaceous kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Nov  4 10:26:21 cretaceous kernel: disabling kdump
Nov  4 10:26:21 cretaceous kernel: ACPI: PM-Timer IO Port: 0x808
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Nov  4 10:26:21 cretaceous kernel: Processor #0 6:15 APIC version 20
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Nov  4 10:26:21 cretaceous kernel: Processor #1 6:15 APIC version 20
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x12] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x13] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x14] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x15] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x16] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC (acpi_id[0x08] lapic_id[0x17] disabled)
Nov  4 10:26:21 cretaceous kernel: ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
Nov  4 10:26:21 cretaceous kernel: ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
Nov  4 10:26:21 cretaceous kernel: IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
Nov  4 10:26:21 cretaceous kernel: ACPI: IOAPIC (id[0x03] address[0xfec80000] gsi_base[32])
Nov  4 10:26:21 cretaceous kernel: IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 32-55
Nov  4 10:26:21 cretaceous kernel: ACPI: IOAPIC (id[0x04] address[0xfec83000] gsi_base[128])
Nov  4 10:26:21 cretaceous kernel: IOAPIC[2]: apic_id 4, version 32, address 0xfec83000, GSI 128-151
Nov  4 10:26:21 cretaceous kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Nov  4 10:26:21 cretaceous kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Nov  4 10:26:21 cretaceous kernel: Setting APIC routing to physical flat
Nov  4 10:26:21 cretaceous kernel: ACPI: HPET id: 0x8086a201 base: 0xfed00000
My first reaction on looking at this messages log was that "svc: unknown version (3)" might be the clue. But then I saw that this warning appears many times, and the machine did not restart after every one of them!
So I am still clueless. Any more pointers on where to look or how to interpret the logs would be really appreciated!
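
One way to compare several reboots at once is to pull out the few lines logged just before each "syslogd ... restart." marker, since that line is the first thing written after every boot in the excerpt above. The following is a rough sketch, not a definitive tool: the name `lines_before_restarts` and the default context size are invented for illustration, and the pattern assumes the sysklogd marker format shown in this thread.

```python
import re

# Assumed marker: syslogd writes a line ending in "restart." when it starts,
# as in "cretaceous syslogd 1.4.1: restart." from the excerpt in this thread.
RESTART = re.compile(r"syslogd \S+: restart\.\s*$")


def lines_before_restarts(lines, context=5):
    """Return a list of (marker_line, preceding_lines) for each restart marker.

    preceding_lines holds up to `context` lines logged right before the marker.
    """
    results = []
    for i, line in enumerate(lines):
        if RESTART.search(line):
            start = max(0, i - context)
            preceding = [entry.rstrip("\n") for entry in lines[start:i]]
            results.append((line.rstrip("\n"), preceding))
    return results
```

To try it, read the file with `open("/var/log/messages").readlines()` and pass the result in. If the lines before a restart contain no sign of services being stopped, the machine probably went down abruptly rather than through an orderly shutdown. The binary wtmp file can also be read with the `last` command, which lists boot times under the `reboot` pseudo-user.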

TIA,
 
03-15-2011, 12:04 PM   #4
someshpr
Original Poster
It seems that a code requiring a huge amount of memory was the cause. My guess is that it was using up all the memory, forcing the NFS server to reboot. It's only a guess, but since we stopped running this code the server hasn't rebooted on its own!
The same code overloaded one other system, whose hard disk eventually crashed when a hard reboot was forced manually. That's how we more or less homed in on this particular code.
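
A guess like this could be checked by searching the saved messages logs for out-of-memory reports from the kernel. The sketch below is only a starting point: the helper name `find_oom_lines` is made up, the two search patterns are assumptions (the exact wording differs between kernel versions), and nothing in this thread shows whether such lines were ever logged on this server.

```python
import re

# Assumed patterns; OOM-killer wording varies between kernel versions.
OOM = re.compile(r"out of memory|oom-killer", re.IGNORECASE)


def find_oom_lines(lines):
    """Return (line_number, text) for each line that looks like an OOM report."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM.search(line)
    ]
```

Running it over `/var/log/messages` and its rotated copies, and comparing any hits with the reboot times, would show whether memory exhaustion lines up with the restarts. An empty result does not rule the theory out, since a machine that hangs or resets hard may never get the chance to write the message.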

Thanks,
 
  



Tags
logfile, rebooting


All times are GMT -5.
