S5520UR Reboots unexpectedly - Nothing on the logs
If the logs show nothing, increase your logging level.
Have you a watchdog enabled?
Is ACPI or anything else configured to reboot on any condition? You can rule out ACPI by running acpid with '-l', which throws its events into syslog.
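A quick way to answer the watchdog question is to look for the usual places a watchdog shows up. The sketch below is only an illustration: it checks for `/dev/watchdog*` device nodes and for the kernel's NMI watchdog sysctl. The `ROOT` variable and the function name `watchdog_status` are made up for this example; `ROOT` exists only so the checks can be pointed at another directory tree and should stay unset on a live system.

```shell
#!/bin/sh
# Sketch: report any watchdog visible on this box.
# ROOT is a hypothetical override for the filesystem root; leave it unset
# on a live system.
ROOT="${ROOT:-}"

watchdog_status() {
    found=no

    # Watchdog drivers expose /dev/watchdog, /dev/watchdog0, ...
    for dev in "$ROOT"/dev/watchdog*; do
        if [ -e "$dev" ]; then
            echo "device: ${dev#"$ROOT"}"
            found=yes
        fi
    done

    # The kernel's NMI watchdog is controlled through this sysctl (1 = on).
    nmi="$ROOT/proc/sys/kernel/nmi_watchdog"
    if [ -r "$nmi" ] && [ "$(cat "$nmi")" = "1" ]; then
        echo "nmi_watchdog: enabled"
        found=yes
    fi

    if [ "$found" = no ]; then
        echo "no watchdog found"
    fi
}
```

Calling `watchdog_status` with no arguments prints one line per watchdog it finds. Finding nothing here does not rule out a watchdog in the BMC or BIOS, which the OS cannot see this way.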
Hello business_kid and thanks for your quick answer !
There isn't a watchdog installed.
I'll see if I can increase the level of logging ...
Thanks !
The only errors I've found are these in syslog:
kernel: [601468.885157] Uhhuh. NMI received for unknown reason 31 on CPU 1.
kernel: [601468.885648] Do you have a strange power saving mode enabled?
kernel: [601468.886129] Dazed and confused, but trying to continue
But they don't happen just before the reboot.
There's still nothing at the moment of the reboot...
Do you think that these error messages could be related to the unexpected reboots ?
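To judge whether the NMI messages line up with the reboots, it helps to know how long after boot each one arrived, since the bracketed kernel stamp is seconds since boot. The following is a rough sketch, assuming the message text shown above; the function name `nmi_events` and the log path in the usage comment are placeholders, not anything from the thread.

```shell
#!/bin/sh
# Sketch: list "NMI received" kernel messages with the kernel stamp
# converted from seconds-since-boot to days/hours/minutes.
# Usage (log path is an assumption): nmi_events /var/log/syslog
nmi_events() {
    grep -h 'NMI received for unknown reason' "$@" |
    awk '{
        # The kernel stamp looks like "[601468.885157]" or "[ 27.867103]".
        if (match($0, /\[ *[0-9]+\.[0-9]+\]/)) {
            stamp = substr($0, RSTART + 1, RLENGTH - 2)
            secs = int(stamp)
            printf "%dd %02dh %02dm after boot: %s\n",
                   int(secs / 86400), int(secs % 86400 / 3600),
                   int(secs % 3600 / 60),
                   substr($0, RSTART + RLENGTH + 1)
        }
    }'
}
```

If the uptime at each NMI is nowhere near the uptime at which the box resets, the two are less likely to be directly connected, though that alone does not prove it.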
kernel: [601468.886129] Dazed and confused, but trying to continue
That's a kernel error. Google that.
An NMI received for an unknown reason is serious and a possible indicator of trouble (software OR hardware). The 'NM' in that stands for "Non Maskable." If those aren't being handled properly, it sounds like one CPU goes AWOL.
Hi, I've just found out that there is a watchdog enabled. I found these in the kernel log:
kern.log.1:Mar 14 17:27:25 SUPERPUMA kernel: [ 0.323307] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 0.323312] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
A watchdog watches for processes going out to lunch, and resets the box. This is usually done with a timer counting down. Every time you get back to the basic routine, it loads up the timer. If one of your processes goes out to lunch, your core goes with it, so the watchdog runs out and you reset.
One thing you could do is disable it. After bootup, this might do it.
Code:
echo 0 > /proc/sys/kernel/nmi_watchdog
Thanks for the answer !
I'm currently testing the system with ACPI off. If I get a reboot I'll disable the watchdog as well...
Do you think these log messages are connected with the reboots ?
/var/log/kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 27.867103] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000428-0x000000000000042F (\GPE0) (20140424/utaddress-254)
/var/log/kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 27.867109] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
/var/log/kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 27.867112] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x000000000000052F (\_SI_.SIOR) (20140424/utaddress-254)
/var/log/kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 27.867114] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
/var/log/kern.log.1:Mar 18 20:30:12 SUPERPUMA kernel: [ 0.648897] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
I would turn up logging and scour the log from 5 - 6 to see if anyone attacks, mail or ftp gets busy, whatever. It could be idle, start on disk maintenance, and puke on that. Find out.
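Scouring one hour of a busy log is easier with a filter. This is a minimal sketch, assuming "5 - 6" means 05:00 to 06:00 and that the log uses the traditional "Mon DD HH:MM:SS host" prefix seen in the kern.log excerpts; the function name `log_window` and the path in the usage comment are made up for the example.

```shell
#!/bin/sh
# Sketch: print syslog lines for one day whose time falls in [from, to).
# Assumes the traditional "Mon DD HH:MM:SS host ..." line prefix.
# Usage: log_window Mar 18 05:00:00 06:00:00 /var/log/syslog
log_window() {
    mon=$1 day=$2 from=$3 to=$4
    shift 4
    awk -v mon="$mon" -v day="$day" -v from="$from" -v to="$to" '
        # Fixed-width HH:MM:SS strings sort correctly as plain text;
        # appending "" forces a string comparison.
        $1 == mon && $2 + 0 == day + 0 && ($3 "") >= (from "") && ($3 "") < (to "")
    ' "$@"
}
```

Running it against each rotated log for the days with a reboot gives a short list to read through by eye.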
Thanks !
I'll try to compare Monday's syslog with Tuesday's to see if I can find something odd...
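Comparing two days of syslog directly is noisy because every line differs in its timestamp and PID. One possible approach, sketched here under the assumption of the same traditional syslog prefix, is to strip those parts and list only the message texts that are new in the second file. The function names `normalize_log` and `only_in_second` and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: show message texts that appear in the second log but not the first.
# Strips the "Mon DD HH:MM:SS host" prefix, "[pid]" after program names and
# kernel "[ 123.456]" stamps so identical events compare equal.
normalize_log() {
    sed -E \
        -e 's/^[A-Z][a-z]{2} +[0-9]+ [0-9:]{8} [^ ]+ //' \
        -e 's/^([^ :]+)\[[0-9]+\]:/\1:/' \
        -e 's/\[ *[0-9]+\.[0-9]+\] //' "$1" | sort -u
}

# Usage (file names are placeholders): only_in_second monday.log tuesday.log
only_in_second() {
    a=$(mktemp) b=$(mktemp)
    normalize_log "$1" > "$a"
    normalize_log "$2" > "$b"
    # comm -13 keeps only the lines unique to the second sorted file.
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

Messages that carry changing numbers in their text (addresses, counters) will still show up as different, so the output is a starting point rather than a verdict.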