LinuxQuestions.org
05-14-2024, 09:47 AM   #1
arubin
Senior Member
 
Registered: Mar 2004
Location: Middx UK
Distribution: Slackware64 15.0 (multilib)
Posts: 1,352

Rep: Reputation: 75
Tracking down cause of crash ?graphics driver


I am trying to track down the cause of crashes on a fairly new PC. Every couple of weeks the mouse cursor freezes, the computer becomes unresponsive, and after a few seconds the screen goes black. I suspect, though I do not know how to prove it, that this is a graphics driver issue. The graphics card is a Radeon RX 6600. I am running Slackware 15 with multilib, kept up to date with slackpkg, and I use KDE. My impression is that switching to Wayland reduced the frequency of crashes, but I am not sure. The crashes do not seem tied to any particular activity. Memtest does not show any problems.

I haven't found anything helpful in any of the logs, but perhaps I don't know what I'm looking for.

Is there any particular thing I should be looking for?

Are there any known issues with Radeon and Linux? I had thought this kernel, 5.15.145, worked well with Radeon.
 
05-14-2024, 10:15 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,437

Rep: Reputation: 2340
I'm on similar kit with the RX 6600XT. I ran Slackware64 & Multilib for two years on 5.15.63. No issues at all.

Keep an eye on /var/log/syslog and htop while the system is running. When it freezes, switch to the htop window. Another trick is to boot to runlevel 3 and use startx. Then, when it hangs, hit Ctrl+Alt+F1 and look for output.
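
A minimal sketch of the log-watching idea in Python, for anyone who wants only the GPU-related lines rather than the whole syslog. The filter pattern and the names `is_gpu_line`, `follow`, and `watch` are assumptions made up for this illustration; the pattern is a rough first guess, not a complete list of amdgpu messages.

```python
import re
import time

# Assumption: lines mentioning "amdgpu", "[drm:" or "[drm]" are the ones of interest.
# This is a rough first filter, not an exhaustive list of driver messages.
GPU_PATTERN = re.compile(r"amdgpu|\[drm[:\]]")


def is_gpu_line(line: str) -> bool:
    """Return True if a syslog line looks like it came from the GPU driver."""
    return GPU_PATTERN.search(line) is not None


def follow(path: str, poll_seconds: float = 1.0):
    """Yield lines appended to `path`, similar in spirit to `tail -f`."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, 2)  # jump to the end so only new lines are reported
        while True:
            line = log.readline()
            if line:
                yield line.rstrip("\n")
            else:
                time.sleep(poll_seconds)


def watch(path: str = "/var/log/syslog") -> None:
    """Print GPU-related lines as they are logged (needs read access to the log)."""
    for line in follow(path):
        if is_gpu_line(line):
            print(line)
```

Calling `watch()` in a spare terminal, or over ssh from another machine if one is available, leaves the last driver messages on screen when the desktop stops responding.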
 
05-14-2024, 10:16 AM   #3
Thom1b
Member
 
Registered: Mar 2010
Location: France
Distribution: Slackware
Posts: 486

Rep: Reputation: 339
If your PC is new, I suggest you use a recent kernel. When I bought my current PC, I had several hardware bugs even with the latest kernel (5.12 at the time). I had to wait for 5.14 to get stable drivers.
 
05-14-2024, 12:21 PM   #4
jayjwa
Member
 
Registered: Jul 2003
Location: NY
Distribution: Slackware, Termux
Posts: 795

Rep: Reputation: 255
Two things come to mind. The first is if you see something like this in the logs:

Code:
kernel: [466414.859438] amdgpu 0000:09:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400800000).
kernel: [466415.083620] [drm:si_dpm_set_power_state [amdgpu]] *ERROR* si_enable_smc_cac failed
kernel: [466415.473694] [drm] UVD initialized successfully.
kernel: [466415.946024] amdgpu 0000:09:00.0: amdgpu: recover vram bo from shadow start
kernel: [466415.947886] amdgpu 0000:09:00.0: amdgpu: recover vram bo from shadow done
kernel: [466415.947926] [drm] Skip scheduling IBs!
kernel: [466415.948001] [drm] Skip scheduling IBs!
kernel: [466415.994745] amdgpu 0000:09:00.0: amdgpu: GPU reset(1) succeeded!
I had these with crashes like the ones you describe (Ryzen/AMD/Radeon HD 8570), anywhere from about 2 days to 1 or 2 weeks apart, without warning. Since moving to Wayland, I've only had one of those since Feb. The system stays up until I shut it down for a new kernel. This was across multiple Mesa versions, kernel versions, window managers, etc. The only constant was amdgpu. The other is the well-known RCU no callbacks issue, which is already documented online in various places.
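
To see how long the machine had been up when each of these resets happened, the bracketed kernel timestamp (seconds since boot) can be converted to days. A rough sketch in Python, assuming the log keeps that timestamp and that a recovered reset is worded "GPU reset(N) succeeded" as in the sample above; `reset_uptimes_days` is a made-up helper name for this illustration.

```python
import re

# Matches the seconds-since-boot stamp the kernel prepends, e.g. "[466415.994745]".
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")
# Assumption: a recovered reset is logged as "GPU reset(N) succeeded",
# as in the sample above; other kernels may word it differently.
RESET_OK = re.compile(r"GPU reset\(\d+\) succeeded")


def reset_uptimes_days(lines):
    """Return the uptime in days at each successful GPU reset found in `lines`."""
    uptimes = []
    for line in lines:
        if RESET_OK.search(line):
            stamp = STAMP.search(line)
            if stamp:
                uptimes.append(float(stamp.group(1)) / 86400.0)
    return uptimes
```

For the sample above this gives a single reset a little under five and a half days after boot, which fits the "2 days to 1 or 2 weeks" pattern.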
 
05-14-2024, 01:12 PM   #5
arubin
Senior Member
 
Registered: Mar 2004
Location: Middx UK
Distribution: Slackware64 15.0 (multilib)
Posts: 1,352

Original Poster
Rep: Reputation: 75
I have found the following. Could this be it?
Quote:
May 12 15:58:08 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:09 Lavankossot last message buffered 1 times
May 12 15:58:09 Lavankossot kernel: sched: RT throttling activated
May 12 15:58:09 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:12 Lavankossot last message buffered 10 times
May 12 15:58:12 Lavankossot kernel: INPUT packet died: IN=eth0 OUT= MAC=08:bf:b8:36:1a:58:e8:6f:38:31:21:95:08:00 SRC=192.168.1.166 DST=192.168.1.51 LEN=524 TOS=0x00 PREC=0x00 TTL=64 ID=8451 DF PROTO=UDP SPT=51613 DPT=54040 LEN=504
May 12 15:58:12 Lavankossot kernel: INPUT packet died: IN=eth0 OUT= MAC=08:bf:b8:36:1a:58:e8:6f:38:31:21:95:08:00 SRC=192.168.1.166 DST=192.168.1.51 LEN=306 TOS=0x00 PREC=0x00 TTL=64 ID=8524 DF PROTO=UDP SPT=40515 DPT=54040 LEN=286
May 12 15:58:13 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:13 Lavankossot last message buffered 1 times
May 12 15:58:13 Lavankossot kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
May 12 15:58:13 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:13 Lavankossot kernel: INPUT packet died: IN=eth0 OUT= MAC=08:bf:b8:36:1a:58:7a:9f:29:c6:f7:8a:08:00 SRC=192.168.1.227 DST=192.168.1.51 LEN=344 TOS=0x00 PREC=0x00 TTL=64 ID=62732 PROTO=UDP SPT=1900 DPT=54040 LEN=324
May 12 15:58:13 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:14 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable gfxoff!
May 12 15:58:19 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command!
May 12 15:58:19 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable gfxoff!
May 12 15:58:24 Lavankossot kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=169044, emitted seq=169046
May 12 15:58:24 Lavankossot kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_wayland pid 1882 thread kwin_wayla:cs0 pid 1884
May 12 15:58:29 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command!
May 12 15:58:29 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable gfxoff!
May 12 15:58:30 Lavankossot kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
May 12 15:58:31 Lavankossot last message buffered 2 times
May 12 15:58:33 Lavankossot kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 12 15:58:33 Lavankossot kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
May 12 15:58:34 Lavankossot kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 12 15:58:34 Lavankossot kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
May 12 15:58:34 Lavankossot kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
May 12 15:58:36 Lavankossot kernel: INPUT packet died: IN=eth0 OUT= MAC=01:00:5e:00:00:fb:e8:6f:38:31:21:95:08:00 SRC=192.168.1.166 DST=224.0.0.251 LEN=68 TOS=0x00 PREC=0x00 TTL=255 ID=31962 DF PROTO=UDP SPT=5353 DPT=5353 LEN=48
May 12 15:58:40 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command!
May 12 15:58:40 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to disable smu features.
May 12 15:58:40 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: Fail to disable dpm features!
May 12 15:58:40 Lavankossot kernel: [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
May 12 15:58:41 Lavankossot kernel: [drm] psp gfx command DESTROY_TMR(0x7) failed and response status is (0x80000306)
May 12 15:58:46 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command!
May 12 15:58:46 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset failed
May 12 15:58:46 Lavankossot kernel: amdgpu 0000:0c:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:0c:00.0
May 12 15:58:59 Lavankossot kernel: [drm] failed to load ucode SMC(0x18)
May 12 15:58:59 Lavankossot kernel: [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0x80000306)
May 12 15:59:01 Lavankossot kernel: [drm] psp gfx command AUTOLOAD_RLC(0x21) failed and response status is (0x0)
May 12 15:59:01 Lavankossot kernel: [drm:psp_load_non_psp_fw [amdgpu]] *ERROR* Failed to start rlc autoload
May 12 15:59:01 Lavankossot kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
May 12 15:59:01 Lavankossot kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -22
May 12 15:59:01 Lavankossot kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
May 12 15:59:01 Lavankossot last message buffered 9 times
May 12 15:59:01 Lavankossot kernel: snd_hda_intel 0000:0c:00.1: CORB reset timeout#2, CORBRP = 65535
May 12 15:59:01 Lavankossot ModemManager[2180]: <warn> could not acquire the 'org.freedesktop.ModemManager1' service name
May 12 15:59:01 Lavankossot pulseaudio[2138]: [pulseaudio] stdin-util.c: Lost I/O connection in module "module-gsettings"
May 12 15:59:11 Lavankossot kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=12351, emitted seq=12353
May 12 15:59:11 Lavankossot kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
May 12 15:59:11 Lavankossot pulseaudio[2138]: [pulseaudio] x11wrap.c: X11 I/O error handler called
May 12 15:59:11 Lavankossot pulseaudio[2138]: [pulseaudio] x11wrap.c: X11 I/O error exit handler called, preparing to tear down X11 modules
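
A log like this is easier to read once the amdgpu errors are grouped. Below is a small Python sketch that counts `*ERROR*` lines per kernel function and pulls out the ring timeouts and the process named alongside them. The regular expressions are assumptions based only on the lines quoted above, and `summarise` is a made-up name for this illustration.

```python
import re
from collections import Counter

# "[drm:FUNCTION [amdgpu]] *ERROR* message", as seen in the quoted log.
DRM_ERROR = re.compile(r"\[drm:(\w+) \[amdgpu\]\] \*ERROR\* (.*)")
RING_TIMEOUT = re.compile(r"ring (\S+) timeout")
PROCESS = re.compile(r"Process information: process (\S+) pid (\d+)")


def summarise(lines):
    """Count amdgpu *ERROR* lines per function; collect ring timeouts and processes."""
    per_function = Counter()
    rings = []
    processes = []
    for line in lines:
        match = DRM_ERROR.search(line)
        if not match:
            continue  # firewall, audio and other unrelated lines are skipped
        function, message = match.groups()
        per_function[function] += 1
        ring = RING_TIMEOUT.search(message)
        if ring:
            rings.append(ring.group(1))
        proc = PROCESS.search(message)
        if proc:
            processes.append((proc.group(1), int(proc.group(2))))
    return per_function, rings, processes
```

The counts are lower bounds, because syslog folds repeats into "last message buffered N times" lines that this sketch does not expand. The first ring timeout and the process reported next to it (here `gfx_0.0.0` and `kwin_wayland`) are usually the details worth quoting when searching for or reporting the problem.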
 
  


All times are GMT -5. The time now is 05:40 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration