LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Laptop and Netbook
05-21-2024, 05:40 AM   #1
joe_2000
Senior Member
 
Registered: Jul 2012
Location: Aachen, Germany
Distribution: Void, Debian
Posts: 1,019

Rep: Reputation: 308
AMD GPU crashing sporadically on Thinkpad T14 Gen 4


Hi. I purchased a new Thinkpad T14 Gen 4 Laptop with an AMD GPU and it sporadically crashes my X server (running Openbox on Debian Bookworm).

GPU according to lspci:
Code:
64:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Phoenix1 (rev dd)
The crash is reproducible by running any of the glmark2 benchmarks (or at least the scenarios build, shading and texture); it happens at exactly the end of the benchmark.

I tested with kernels 6.1, 6.6, 6.7, 6.8 and the behavior is the same.

The crash produces the following log output:
Code:
May 21 12:18:51 laptop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=61972, emitted seq=61974
May 21 12:18:51 laptop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process glmark2 pid 3403 thread glmark2:cs0 pid 3404
May 21 12:18:51 laptop kernel: amdgpu 0000:64:00.0: amdgpu: GPU reset begin!
May 21 12:18:52 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:52 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:52 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:52 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:52 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:52 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:52 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:52 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:53 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:53 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:53 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:53 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
May 21 12:18:53 laptop kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
May 21 12:18:53 laptop kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
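For anyone comparing a similar log against this one, the error classes above can be tallied with a small filter. This is only a sketch: the function name summarize_amdgpu_errors is made up for illustration, and the patterns are taken from the messages in this log, so other kernel versions may word them differently.

```shell
# Hypothetical helper: count amdgpu error classes in kernel log text read from stdin.
# The patterns mirror the messages quoted in this thread.
summarize_amdgpu_errors() {
    awk '
        /amdgpu_job_timedout/ && /ring .* timeout/ { timeouts++ }
        /MES failed to response/                   { mes++ }
        /failed to unmap legacy queue/             { unmap++ }
        /GPU reset begin/                          { resets++ }
        END {
            printf "ring_timeouts=%d mes_failures=%d unmap_failures=%d gpu_resets=%d\n",
                   timeouts, mes, unmap, resets
        }
    '
}

# Example usage (assumes the kernel log is available through journald):
#   journalctl -k -b --no-pager | summarize_amdgpu_errors
```

A single ring timeout followed by a burst of MES and unmap failures, as in the log above, suggests that the reset itself is failing rather than the GPU hanging repeatedly.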
I also tried using the following kernel params but no joy:
Code:
iommu=soft amdgpu.runpm=0 amdgpu.sg_display=0
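When testing parameters like these it is worth confirming that they actually reached the running kernel, since a bootloader config that was not regenerated silently drops them. A minimal sketch, assuming the parameters are passed on the kernel command line; the helper name check_kernel_params is hypothetical.

```shell
# Hypothetical helper: report whether each parameter appears in a kernel command line.
# First argument: the command line string; remaining arguments: parameters to look for.
check_kernel_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        # Pad with spaces so only whole, space-separated words match.
        case " $cmdline " in
            *" $param "*) echo "$param: present" ;;
            *)            echo "$param: missing" ;;
        esac
    done
}

# Example usage on the running system:
#   check_kernel_params "$(cat /proc/cmdline)" iommu=soft amdgpu.runpm=0 amdgpu.sg_display=0
```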
Installed firmware packages:
Code:
firmware-amd-graphics				install
firmware-atheros				install
firmware-intel-sound				install
firmware-iwlwifi				install
firmware-linux					install
firmware-linux-free				install
firmware-linux-nonfree				install
firmware-misc-nonfree				install
firmware-realtek				install
firmware-sof-signed				install
Any hints as to what else I could try would be much appreciated!
 
05-21-2024, 10:18 AM   #2
joe_2000
Senior Member
 
Registered: Jul 2012
Location: Aachen, Germany
Distribution: Void, Debian
Posts: 1,019

Original Poster
Rep: Reputation: 308
Some more information on my system to make this thread more discoverable through search: according to the BIOS, I am running an AMD Ryzen 7 PRO 7840U w/Radeon 780M Graphics.

I am running the default glmark2 test suite as I am writing this. The kernel I am using is still 6.8.9 without any additional kernel params.
The change that seems to have fixed my problem was installing the packages firmware-amd-graphics_20230625-2_all.deb and firmware-linux-free_20200122-4_all.deb from the Debian trixie repos. I did not test whether both really needed to be installed from trixie.
I installed the AMD graphics package first and it told me it had removed firmware-linux-free, so I decided to install that from trixie as well.

I then rebooted and yay, glmark2 no longer crashes.

I am wondering if there would be a cleaner way to run up to date firmware without actually upgrading everything to trixie, so I'll not mark this as solved (yet) in case someone has any ideas.
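One approach often used for this kind of question is APT pinning: keep trixie at a low priority overall and raise it only for the firmware packages. The following is a minimal sketch that has not been tested on this machine. It assumes a trixie entry (including the non-free-firmware component) has been added to the APT sources. The helper name print_firmware_pin and the preferences file name are hypothetical. It covers the firmware only, not the kernel.

```shell
# Hypothetical helper: print APT preferences that keep a release at low
# priority overall but prefer it for the named packages.
# Usage: print_firmware_pin RELEASE PACKAGE...
print_firmware_pin() {
    release=$1
    shift
    # Low priority for everything from the release: nothing is pulled in by default.
    printf 'Package: *\nPin: release n=%s\nPin-Priority: 100\n\n' "$release"
    # Higher priority for the listed packages only.
    printf 'Package: %s\nPin: release n=%s\nPin-Priority: 990\n' "$*" "$release"
}

# Example (the file name under /etc/apt/preferences.d/ is an assumption):
#   print_firmware_pin trixie firmware-amd-graphics firmware-linux-free \
#       | sudo tee /etc/apt/preferences.d/firmware-trixie
#   sudo apt update && apt policy firmware-amd-graphics
```

Running apt policy afterwards shows which version APT would pick, so the pin can be checked before anything is installed.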
 
05-22-2024, 05:08 AM   #3
joe_2000
Senior Member
 
Registered: Jul 2012
Location: Aachen, Germany
Distribution: Void, Debian
Posts: 1,019

Original Poster
Rep: Reputation: 308
For the sake of completeness, I have now tested whether the kernel upgrade to 6.8.9 (which I downloaded from the sid repos) is really needed, and yes, it is. Even the 6.7 kernel from bookworm-backports is not new enough to prevent the crash triggered by glmark2.
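To check whether a given machine is already on a kernel at least as new as the one reported working here, a version comparison along these lines can help. It is a sketch only: the helper name kernel_at_least is hypothetical, it relies on GNU sort -V, and 6.8.9 is simply the version tested in this thread, not a confirmed minimum.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on the version sort (-V) provided by GNU coreutils.
kernel_at_least() {
    have=$1
    want=$2
    # If the wanted version sorts first (or the two are equal), the running one is new enough.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Example usage: strip the distribution suffix from uname -r before comparing.
# Note that some Debian kernels report an ABI name there rather than the exact
# upstream version, so treat the result as a rough check.
#   running=$(uname -r | sed 's/[^0-9.].*//')
#   kernel_at_least "$running" 6.8.9 && echo "kernel is 6.8.9 or newer"
```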
 
  


All times are GMT -5.
