Linux - Hardware This forum is for Hardware issues.
04-25-2020, 04:25 PM
|
#1
|
LQ Newbie
Registered: Jun 2017
Location: Wasilla, Alaska
Distribution: Fedora, Linux Mint
Posts: 6
Rep: 
|
LinuxMint 19.3 Sudden Power-Off Under High GPU Usage
Hello folks,
I'm looking for some suggestions on where to look for troubleshooting.
Problem:
My desktop powers off (as if the power cable were pulled) when under high graphical load (gaming), with no warning or errors that I can find. The power-off is sudden: one moment I'm fine, then in the blink of an eye the screens are black and the PC is shut down. To turn the PC back on I need to toggle the switch on the PSU off, then on again. Then I can hit the power button.
No issues under normal usage, this only happens while gaming or stress-testing the GPU.
What have I tried to fix it?
- RMA'd the graphics card and they sent back a new replacement.
- RMA'd the motherboard and they sent back a new replacement.
- Upgraded my PSU from a 500 W to a new 600 W PSU.
- Checked all the cables to ensure they're plugged in properly and snug.
- Compiled the Unreal Engine (which takes just under an hour on my machine) with no problems. I'm assuming this was a good CPU stress test.
- Run additional stress tests via Phoronix Test Suite. No problems with CPU tests but the problem persists with Graphics tests.
- I've checked but I'm not seeing anything that would indicate a problem
- I've also run
Code:
journalctl -f > ~/journalctl.log
while replicating the problem, but again no signs of a problem.
- Ran memtest86 to test the RAM. No problems.
My PC Specs:
Any suggestions on where to start looking for clues would be very helpful. I don't think it's an overheating issue: I've watched the sensors output while testing and it never reaches a critical temperature. I'm pretty much at a loss as to where else to look.
Last edited by skriptmonkey; 04-25-2020 at 11:24 PM.
|
|
|
04-26-2020, 08:47 AM
|
#2
|
LQ Guru
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,613
|
Gaming is not only high GPU stress, it's high CPU stress. The APU hitting critical temperature is an automatic trigger for power off. Things usually shut down elegantly, but depending on the BIOS it could be kneejerk.
|
|
|
04-26-2020, 08:58 AM
|
#3
|
Senior Member
Registered: Oct 2003
Location: Elgin,IL,USA
Distribution: KDE Neon
Posts: 1,272
|
What video driver are you using, the default open source Radeon driver, or the proprietary driver from AMD?
I assume that all the cooling fans are working and that the case has good air flow. Is the fancontrol program seeing the fans and temp settings?
|
|
|
04-26-2020, 05:26 PM
|
#4
|
LQ Newbie
Registered: Jun 2017
Location: Wasilla, Alaska
Distribution: Fedora, Linux Mint
Posts: 6
Original Poster
Rep: 
|
Quote:
Originally Posted by uteck
What video driver are you using, the default open source Radeon driver, or the proprietary driver from AMD?
I assume that all the cooling fans are working and that the case has good air flow. Is the fancontrol program seeing the fans and temp settings?
|
- I am using the default open source amdgpu driver.
- Cooling fans are all working and there is plenty of airflow into the case.
- For fan control, the only thing I know to look at is the output from sensors. I do have the fans set to "performance" mode in the BIOS.
This is under idle usage. My understanding is that "k10temp-pci-00c3" is the CPU fan.
I don't see anything for the fan at the front of the case.
Code:
$ sensors
asus-isa-0000
Adapter: ISA adapter
cpu_fan: 0 RPM
amdgpu-pci-0900
Adapter: PCI adapter
vddgfx: +0.95 V
fan1: 772 RPM (min = 0 RPM, max = 3500 RPM)
edge: +40.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 29.03 W (cap = 145.00 W)
k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +31.5°C (high = +70.0°C)
Tctl: +41.5°C
|
|
|
04-26-2020, 05:55 PM
|
#5
|
Senior Member
Registered: Oct 2003
Location: Elgin,IL,USA
Distribution: KDE Neon
Posts: 1,272
|
Since it only happens when under high GPU load, it seems to be hardware or driver related.
You could try tailing /var/log/faillog, /var/log/kern.log, and /var/log/syslog, with the -f option to follow the log as it updates like so
Code:
tail -f /var/log/syslog
That way you may be able to see the error even if it does not get written to disk.
You could try the other AMD driver and see if that makes a difference in your stress test. Might be a bug in the open driver?
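As a complement to following the logs live, here is a rough sketch of checking what the kernel logged during the boot that ended in the power-off. It assumes journald keeps its logs on disk (for example, that /var/log/journal exists); the function names and the keyword list are made up for illustration and are not an exhaustive set of kernel messages. With a hard power cut there may simply be nothing logged at all.

```shell
#!/bin/sh
# Filter log lines (read from stdin) down to ones that mention the GPU driver,
# thermal events, machine-check errors, or power problems.
# The keyword list is a guess, not a complete list of relevant messages.
filter_suspects() {
    grep -iE 'amdgpu|thermal|mce|overcurrent|power' || true
}

# Show the last matching kernel messages from the previous boot, i.e. the boot
# that ended in the sudden power-off. Requires a persistent journal.
last_boot_suspects() {
    journalctl -k -b -1 --no-pager | filter_suspects | tail -n 50
}
```

After rebooting from a crash, sourcing the file and running last_boot_suspects would print whatever matched. An empty result does not rule out a hardware fault; it only means nothing reached the disk.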
|
|
|
04-27-2020, 03:18 AM
|
#6
|
LQ Guru
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,613
|
@skriptmonkey: Those temperatures you showed us were at tickover, not gaming.
Let's see some temperatures from high CPU/GPU loads, 5 minutes and 10 minutes into a game. You might seriously need to upgrade the fans and CPU heatsinks. If they're not black, painting them matt black improves heat dissipation. That's been laboratory tested. In fact there used to be a paint sold in motorbike shops called 'cylinder head black' for the 2 strokes which were always on the edge of heat problems, the way they were driven here anyway.
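One possible way to capture those readings, sketched under a few assumptions: the labels (edge, Tdie, Tctl, power1, fan1) are the ones from the sensors output earlier in the thread, while the function names, the log path, and the 5-second interval are illustrative only. Each snapshot is flushed with sync so the last readings before a sudden power-off should still be on disk afterwards.

```shell
#!/bin/sh
# Keep only the temperature, power, and fan lines from `sensors` output on
# stdin and prefix each with the timestamp given as the first argument.
stamp_readings() {
    grep -E '^(edge|Tdie|Tctl|power1|fan1):' | sed "s/^/$1 /"
}

# Append one stamped snapshot to a log file every INTERVAL seconds and flush
# it to disk. Both defaults are illustrative, not values from the thread.
log_sensors() {
    logfile=${1:-"$HOME/sensors.log"}
    interval=${2:-5}
    while true; do
        sensors | stamp_readings "$(date +%H:%M:%S)" >> "$logfile"
        sync
        sleep "$interval"
    done
}
```

Sourcing the file and running log_sensors in a terminal before starting the game, then reading the end of the log after the next power-off, would show how close edge and Tdie were to their limits in the final seconds.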
|
|
|
04-30-2020, 07:50 PM
|
#7
|
LQ Newbie
Registered: Jun 2017
Location: Wasilla, Alaska
Distribution: Fedora, Linux Mint
Posts: 6
Original Poster
Rep: 
|
So I've adjusted the fans. Instead of just selecting "Performance" mode in the BIOS I went ahead and manually set the fans all to turbo. I haven't had a chance to really test if this change made a large difference or not. I played a game for about 20 minutes today mainly just to capture the sensors output after running the game a while.
Code:
$ sensors
asus-isa-0000
Adapter: ISA adapter
cpu_fan: 0 RPM
amdgpu-pci-0900
Adapter: PCI adapter
vddgfx: +1.20 V
fan1: 771 RPM (min = 0 RPM, max = 3500 RPM)
edge: +70.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 45.16 W (cap = 145.00 W)
k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +53.9°C (high = +70.0°C)
Tctl: +63.9°C
I'll do more testing when I get a chance to sit down for a while.
|
|
|
All times are GMT -5. The time now is 03:20 PM.
|