LinuxQuestions.org
Linux - Hardware This forum is for Hardware issues.
Having trouble installing a piece of hardware? Want to know if that peripheral is compatible with Linux?

Old 03-20-2021, 11:21 AM   #31
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322

Have you tried slowing the cpu frequency? Reseating the ram?

What the logs show us is errors related to one particular bank of ram, and some (probably kernel) process going out to lunch. You need to approach this logically, eliminate what you can. Have you ram in Bank 5? If not, ignore it. If so, swap it.

This sort of thing is solved by "divide & conquer" techniques.

It's a while since I had hardware issues but there's a pile of things you can do with special debugging kernel keys. There's also a good choice of boot options to nobble various functions. Stop running the processes you always run, and see if stopping any of them eases your troubles. Add them back, and test again.
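As a rough illustration of the "divide & conquer" idea, here is a minimal sketch that splits a list of suspects into a half to disable and a half to keep for the next test run. The function name halve_list and the suspect names in the usage line are made up for illustration; substitute whatever you normally run.

```shell
# halve_list: split the arguments into two halves.
# Line 1 lists the half to disable for this test run, line 2 the half to keep.
# Whichever half still shows the freeze becomes the next list to halve.
halve_list() {
    total=$#
    half=$(( (total + 1) / 2 ))
    first=""
    second=""
    i=0
    for item in "$@"; do
        i=$((i + 1))
        if [ "$i" -le "$half" ]; then
            first="$first $item"
        else
            second="$second $item"
        fi
    done
    echo "disable:${first}"
    echo "keep:${second}"
}

# Usage (hypothetical suspect names):
#   halve_list wifi-driver compositor browser game-launcher
```

If the freezes stop with the first half disabled, the culprit is probably in that half; otherwise look in the other one, and repeat until one suspect is left.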

EDIT: We did clarify that the kernel does not directly support your wifi, didn't we? Did you use the Realtek source code to compile your wifi driver? And remember GNU/Linux ≠ Darwin 64. Which is the driver for?

Also, somebody else probably had this fault, etc. Read his experience, if it's out there.

Last edited by business_kid; 03-20-2021 at 11:29 AM.
 
Old 03-20-2021, 11:44 AM   #32
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Rep: Reputation: 2063
Your posted kernel log output mentions the amdgpu driver having problems, and from my research and own experience, these are bugs in the amdgpu kernel driver. This would explain the freezing you're having, although I can't say whether amdgpu is the only problem with your system.

In relation to your particular error messages from amdgpu, you might see if the following helps from this Gentoo forum thread quoted below. You might also want to check out this Gentoo forum thread too.

Quote:
Since I installed a Radeon RX570 graphics card and got the AMDGPU driver working, I've had annoying 10 sec delays, when booting and shutdown, and when switching sessions between tty7 and tty 8. Dmesg shows the driver trying to send a series of amdgpu powerplay commands and failing, along the following lines:
Code:
[13842.569209] amdgpu: [powerplay]
                failed to send message 171 ret is 0
I found the hangs and messages disappear if I set the amdgpu module parameter dpm=0, either with an /etc/modprobe.d entry along the lines:
Code:
options amdgpu dpm=0
or the command line parameter
Code:
amdgpu.dpm=0
I don't know if it's disabling something important or not. Xorg.0.log shows
Code:
[    22.232] (II) AMDGPU(0): DPMS capabilities: Off
[    22.535] (==) AMDGPU(0): DPMS enabled
[    22.547] (II) Initializing extension DPMS
but I'm not sure if DPMS (Energy Star power saving) is the same thing as dpm.

I found an intriguing reference that said on old cards and kernels dpm=1 enabled the new dpm; then when AMD power play came out, they swapped its definition and dpm=0 would select power play and dpm=1 would still select the old power management, which might explain the problem.
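To check which dpm value the loaded driver actually picked up, something like the following minimal sketch may help. It assumes the parameter is exported at /sys/module/amdgpu/parameters/dpm, which depends on how your kernel was built; the function name amdgpu_dpm_value is made up for illustration.

```shell
# Print the dpm value the running amdgpu module was loaded with.
# The default path is an assumption about your kernel; pass another path as $1.
# Prints "unknown" if the file is not readable (module not loaded, or the
# parameter is not exported on this kernel).
amdgpu_dpm_value() {
    param_file="${1:-/sys/module/amdgpu/parameters/dpm}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unknown"
    fi
}

amdgpu_dpm_value
```

If this still shows the old value after a reboot, the module may be getting loaded from the initramfs before /etc/modprobe.d is read, in which case the initramfs probably needs regenerating with whatever tool your distribution uses.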

Last edited by jsbjsb001; 03-20-2021 at 11:45 AM. Reason: grammer fix
 
1 members found this post helpful.
Old 03-31-2021, 09:26 AM   #33
dosensuppe
Member
 
Registered: Feb 2021
Location: Germany
Distribution: Artix Linux, Slackware, Gentoo
Posts: 83

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by jsbjsb001 View Post
Your posted kernel log output mentions the amdgpu driver having problems, and from my research and own experience, these are bugs in the amdgpu kernel driver. This would explain the freezing you're having, although I can't say whether amdgpu is the only problem with your system.

In relation to your particular error messages from amdgpu, you might see if the following helps from this Gentoo forum thread quoted below. You might also want to check out this Gentoo forum thread too.
Thanks. I have now disabled dpm in /etc/modprobe.d/amdgpu.conf with
Code:
options amdgpu dpm=0
Let's see if this fixes something.
But unfortunately I think it's an independent problem from the total reboot freezes I get.
 
Old 04-06-2021, 03:06 PM   #34
dosensuppe
Member
 
Registered: Feb 2021
Location: Germany
Distribution: Artix Linux, Slackware, Gentoo
Posts: 83

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by dosensuppe View Post
Thanks. I have now disabled dpm in /etc/modprobe.d/amdgpu.conf with
Code:
options amdgpu dpm=0
Let's see if this fixes something.
But unfortunately I think it's an independent problem from the total reboot freezes I get.
Had to delete the parameter again.
It causes terrible performance slowdowns in video games.
 
Old 04-07-2021, 03:24 AM   #35
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322
At 33 posts on this thread, it's not something simple.

Time surely to brace yourself, download and build the latest stable kernel, & repeat tests. If the problem still exists, report it and prepare for an exchange with a moody developer.

I had one bug where I had to take on a hardware manufacturer, followed by a kernel dev. It turned out the hardware was at fault throwing spurious warnings, so the kernel dev altered his code to ignore it. He had been looking for this bug for years.

But things get sorted, because yours is the sort of feedback they need to make things better. Just develop a thick skin and you'll be fine.
 
1 members found this post helpful.
Old 04-14-2021, 10:12 AM   #36
dosensuppe
Member
 
Registered: Feb 2021
Location: Germany
Distribution: Artix Linux, Slackware, Gentoo
Posts: 83

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by business_kid View Post
At 33 posts on this thread, it's not something simple.

Time surely to brace yourself, download and build the latest stable kernel, & repeat tests. If the problem still exists, report it and prepare for an exchange with a moody developer.

I had one bug where I had to take on a hardware manufacturer, followed by a kernel dev. It turned out the hardware was at fault throwing spurious warnings, so the kernel dev altered his code to ignore it. He had been looking for this bug for years.

But things get sorted, because yours is the sort of feedback they need to make things better. Just develop a thick skin and you'll be fine.
Thank you. I guess sometimes you just get unlucky with specific hardware combinations. At this point I don't even know if I have any guarantee left for the mainboard, which I'm starting to believe is the culprit here.
 
Old 04-14-2021, 12:07 PM   #37
dosensuppe
Member
 
Registered: Feb 2021
Location: Germany
Distribution: Artix Linux, Slackware, Gentoo
Posts: 83

Original Poster
Rep: Reputation: Disabled
Do you think I can report this to a kernel dev directly?
 
Old 04-15-2021, 06:19 AM   #38
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322
I have done so more than once over the years, and got results too. The deal is, you've got the problem hardware, you're seeing the bug, and they want to fix the bug.

Download the latest stable kernel and build it using your current config. If it pukes, file the bug against that kernel & previous ones. Don't get shirty if whatever dev is dealing with you seems rude - they don't hold your hand. After admitting I am no programmer, I corrected a very basic C syntax error on one patch and that caused a storm. But I had built a kernel with his faulty patch, tested it, & reported results; I repeated for his patch with my fix, which worked. He didn't like it one bit, but my syntax fix went in.
IIRC it was "if <condition>" --> "elif <condition>"
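For the "build it using your current config" step, a minimal sketch along these lines may serve as a starting point. It assumes you are at the top of an unpacked kernel source tree and that the running kernel exposes its config at /proc/config.gz (this needs CONFIG_IKCONFIG_PROC); if yours does not, pass the path of the config file your distribution ships. The function name prepare_kernel_config is made up for illustration.

```shell
# Copy the current kernel config into the source tree as .config.
# $1 is the config to start from; the /proc/config.gz default is an
# assumption and only exists if the running kernel was built to expose it.
prepare_kernel_config() {
    src_config="${1:-/proc/config.gz}"
    if [ ! -r "$src_config" ]; then
        echo "no readable config at $src_config" >&2
        return 1
    fi
    case "$src_config" in
        *.gz) zcat "$src_config" > .config ;;
        *)    cp "$src_config" .config ;;
    esac
}

# Usage, from the top of the kernel source tree:
#   prepare_kernel_config && make olddefconfig && make -j"$(nproc)"
```

make olddefconfig keeps your existing answers and takes the defaults for any options that are new in the newer kernel, so the build stays as close as possible to what you are running now.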
 
1 members found this post helpful.
  


Similar Threads
Thread Thread Starter Forum Replies Last Post
[SOLVED] System completely freezes, what is broken and needs replacing? AndrewAmmerlaan Linux - Hardware 7 01-27-2021 02:46 PM
Cpu fan stuck to the cpu, how do I get the cpu out? abefroman Linux - Hardware 16 09-04-2009 12:47 AM
xorg / startx completely freezes system every time granth Slackware 17 11-06-2008 10:42 PM
Startx freezes system completely, mysterious white dots Dymitry Slackware 5 02-19-2007 04:41 PM
system freezes completely c-- Ubuntu 2 03-29-2005 04:23 PM


All times are GMT -5. The time now is 08:21 PM.
