Linux - Hardware This forum is for Hardware issues.
10-25-2021, 04:20 AM
|
#1
|
LQ Newbie
Registered: May 2018
Location: Austria
Distribution: EndeavourOS
Posts: 23
Rep: 
|
Debian: where to look in logs for the culprit of random restarts/freezes
Hi there! I have an AMD-based machine that is experiencing random restarts/freezes, and I do not know how to identify the culprit.
The machine is running Debian 10 and OpenMediaVault 5. So far I have tried a different motherboard, CPU, CPU cooler, and PSU, but the random restarts persist.
The CPUs are an AMD Ryzen 5 1600 and an AMD Ryzen 7 1700, on two B450I motherboards (one from Asus, one from Gigabyte), with a Noctua cooler and the stock CPU cooler. The PSUs are a Silverstone Flex ATX (miserable, but it fits a mini-ITX case) and a Be Quiet! Pure Power 11 400W (which should be decent at least).
What I have not changed (so far) is the RAM (memtest is now running 10 passes in the background), the SSD (a WD Blue M.2 NVMe, which is new, but one never knows) and the OS.
My question is if I can dig somewhere in the logs to find the culprit of these restarts.
Thanks!
|
|
|
10-25-2021, 10:21 AM
|
#2
|
Senior Member
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,294
|
One problem with crashes is that they may prevent logging. Check "last" to see what it says for the failed session. If there is no stack trace shown on the console, and the system just restarts, then I would suspect power issues. Mine will show two "still running" lines in the event of a sudden power outage, like:
Code:
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:14 still running
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:11 still running
reboot system boot 5.14.12-200.fc34 Thu Oct 21 13:02 - 15:38 (02:35)
No reason is shown for the 10:11 boot this morning because I accidentally disconnected power after starting the system. The Oct 21 session shows the line for a normal shutdown.
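A rough sketch of the same check in Python, in case someone wants to scan a long `last` listing automatically. It assumes the behaviour described above (an older boot record that still says "still running" never got a shutdown entry); the function name and the regular expression are made up for illustration, and the sample text is the output quoted in this post.

```python
import re

# Matches the boot records printed by `last`, for example:
# "reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:11 still running"
BOOT_LINE = re.compile(r"^reboot\s+system boot\s+(?P<kernel>\S+)\s+(?P<rest>.+)$")

SAMPLE = """\
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:14 still running
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:11 still running
reboot system boot 5.14.12-200.fc34 Thu Oct 21 13:02 - 15:38 (02:35)
"""


def find_unclean_boots(last_output: str) -> list[str]:
    """Return boot records that look like they never got a shutdown entry.

    `last` prints the newest records first, so the first "still running"
    boot is taken to be the live session; any older boot record that
    also says "still running" is reported as suspicious.
    """
    suspicious = []
    seen_current = False
    for line in last_output.splitlines():
        match = BOOT_LINE.match(line.strip())
        if not match:
            continue  # login records and blank lines are ignored
        if match["rest"].endswith("still running"):
            if seen_current:
                suspicious.append(line.strip())
            seen_current = True
    return suspicious


for record in find_unclean_boots(SAMPLE):
    print(record)
```

On the sample it prints only the 10:11 boot, the one where power was pulled. It says nothing about why the session ended, only that no clean shutdown was recorded.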
|
|
|
10-26-2021, 12:10 AM
|
#3
|
LQ Addict
Registered: Dec 2013
Posts: 19,872
|
Quote:
Originally Posted by beje
My question is if I can dig somewhere in the logs to find the culprit of these restarts.
|
First you will probably have to enable persistent logging, then, after a freeze/restart, issue
|
|
|
10-29-2021, 03:27 AM
|
#4
|
LQ Newbie
Registered: May 2018
Location: Austria
Distribution: EndeavourOS
Posts: 23
Original Poster
Rep: 
|
Quote:
Originally Posted by smallpond
One problem with crashes is that they may prevent logging. Check "last" to see what it says for the failed session. If there is no stack trace shown on the console, and the system just restarts, then I would suspect power issues. Mine will show two "still running" lines in the event of a sudden power outage, like:
Code:
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:14 still running
reboot system boot 5.14.12-200.fc34 Mon Oct 25 10:11 still running
reboot system boot 5.14.12-200.fc34 Thu Oct 21 13:02 - 15:38 (02:35)
No reason is shown for the 10:11 boot this morning because I accidentally disconnected power after starting the system. The Oct 21 session shows the line for a normal shutdown.
|
Code:
root pts/2 10.0.0.29 Fri Oct 29 10:03 still logged in
root pts/1 10.0.0.29 Fri Oct 29 10:01 still logged in
root pts/0 10.0.0.16 Fri Oct 29 08:15 still logged in
reboot system boot 4.19.0-18-amd64 Thu Oct 28 20:41 still running
root pts/0 10.0.0.16 Mon Oct 25 19:34 - 21:45 (02:11)
reboot system boot 4.19.0-18-amd64 Mon Oct 25 19:33 still running
root pts/0 10.0.0.16 Mon Oct 25 19:26 - down (00:06)
root tty1 Mon Oct 25 19:23 - down (00:10)
reboot system boot 4.19.0-18-amd64 Mon Oct 25 19:23 - 19:33 (00:10)
reboot system boot 4.19.0-18-amd64 Mon Oct 25 19:13 - 19:15 (00:02)
It seems that I have the same: the session that started at 19:33 on Oct 25 is the faulty one, and the system restarted on Oct 28 at 20:41. Should I suspect power issues, then? From the PSU or from the grid?
Quote:
Originally Posted by ondoho
First you will probably have to enable persistent logging, then, after a freeze/restart, issue
|
I have enabled it now, and I will have a look after the next restart...
Thank you both.
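For anyone following along, here is a minimal Python sketch of one way to check that persistent logging is actually on and to pull up the previous boot's messages. The command that followed "issue" in the quoted post did not survive in this thread, so this is not a reconstruction of it. It assumes a systemd journal with the default `Storage=auto` behaviour, where logs survive a reboot only if `/var/log/journal` exists; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def journal_is_persistent(journal_dir: str = "/var/log/journal") -> bool:
    """Assumption: with Storage=auto, journald keeps logs across
    reboots only when this directory exists."""
    return Path(journal_dir).is_dir()


def previous_boot_command(priority: str = "warning") -> list[str]:
    """Build a journalctl call for the boot before the current one,
    limited to messages of the given priority or more severe."""
    return ["journalctl", "--boot=-1", f"--priority={priority}", "--no-pager"]


def show_previous_boot() -> None:
    """Print the previous boot's warnings, or explain what is missing."""
    if journal_is_persistent():
        subprocess.run(previous_boot_command(), check=False)
    else:
        print("No persistent journal found: create /var/log/journal or set "
              "Storage=persistent in /etc/systemd/journald.conf, then restart "
              "systemd-journald.")
```

Calling `show_previous_boot()` after an unexpected restart shows whatever was written before the machine went down. As noted earlier in the thread, a hard freeze or power loss may leave nothing useful at the end of that log.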
|
|
|
10-30-2021, 09:52 PM
|
#5
|
LQ Guru
Registered: Aug 2016
Location: SE USA
Distribution: openSUSE 24/7; Debian, Knoppix, Mageia, Fedora, OS/2, others
Posts: 6,425
|
Those power supply brands don't sound like top-notch quality. Is either under warranty? If not, open them up to inspect for leaky or swollen electrolytic capacitors. Visit http://badcaps.net/ to know what to look for. If you find any, you have your answer. Power supplies are notorious sources of random lockups.
|
|
|
10-31-2021, 07:31 AM
|
#6
|
LQ Newbie
Registered: May 2018
Location: Austria
Distribution: EndeavourOS
Posts: 23
Original Poster
Rep: 
|
Quote:
Originally Posted by mrmazda
Those power supply brands don't sound top notch quality. Is either under warranty? If not, open up to inspect for leaky or swollen electrolytic capacitors. Visit http://badcaps.net/ to know what to look for. If you find any, you have your answer. Power supplies are notorious sources of random lockups.
|
The Silverstone is an entry-level PSU, but it was the only Flex-ATX unit I could find on the market. Be Quiet! is a pretty well established brand here in Europe. Both of them are new and under warranty.
|
|
|
11-12-2021, 11:51 PM
|
#7
|
Member
Registered: Sep 2015
Distribution: MX Linux 21.3 Xfce
Posts: 596
Rep: 
|
@beje, the problem with looking at logs after hard freezes and reboots is that whatever causes them often will not show up in the logs, because all writing stops. I have experienced the same issues you are experiencing. I tried memory tests, replaced the power supply, and replaced the video card. This all started 11 months after I built my computer. After another two months of trying to figure out the cause, the culprit showed itself when the computer would no longer boot. In my case the motherboard had failed, and ASUS only gave it a 12-month warranty. Apparently ASUS quality is not what it once was. I replaced it with an MSI motherboard that died after two months. I now have a Gigabyte motherboard that has been great so far, at 1 year and 8 months. I would recommend checking the CPU settings in the BIOS so that the CPU doesn't idle, because with Ryzen CPUs that setting can cause exactly the problem you are experiencing. Also, make sure you are running Linux kernel 5.3 or higher, because earlier kernels can cause those problems with Ryzen CPUs.
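A small Python sketch of the kernel check, for what it is worth. The 5.3 threshold is simply the figure given in this post, not something verified here, and the function names are made up for illustration. The two release strings are the ones shown in the `last` output earlier in the thread.

```python
import re


def kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string like '4.19.0-18-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple[int, int] = (5, 3)) -> bool:
    """True if the release is at least `minimum` (default taken from this post)."""
    return kernel_version(release) >= minimum


print(meets_minimum("4.19.0-18-amd64"))   # kernel from post #4 -> False
print(meets_minimum("5.14.12-200.fc34"))  # kernel from post #2 -> True
```

To test the machine at hand, pass in the output of `uname -r` (or `platform.release()` from Python's standard library). By this measure the Debian 10 kernel in post #4 is older than the suggested minimum.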
|
|
|
11-13-2021, 04:43 AM
|
#8
|
LQ Guru
Registered: Aug 2016
Location: SE USA
Distribution: openSUSE 24/7; Debian, Knoppix, Mageia, Fedora, OS/2, others
Posts: 6,425
|
If using DDR4 RAM I recommend thorough/overnight testing with memtest86, not memtest86+.
|
|
|
11-15-2021, 02:33 AM
|
#9
|
LQ Newbie
Registered: May 2018
Location: Austria
Distribution: EndeavourOS
Posts: 23
Original Poster
Rep: 
|
I have done it: 10 cycles and nothing :-)
The last suspect is a (cheap) wattmeter that I had been using to measure power consumption. I removed it and could no longer reproduce the symptoms, and the machine reached 10 days of uptime. The hardware has since been repurposed... new ideas, new things to do.
Thank you all.
|
|
|
11-15-2021, 02:50 AM
|
#10
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,005
|
Good news! In that case you might want to mark the thread solved.
|
|
|
All times are GMT -5.