LinuxQuestions.org

rdsherman 01-09-2021 01:24 PM

X windows video freeze - KV 5.10.x
 
This behavior is observable for the last few weeks since Slackware64-current adopted KV 5.10.x (1<x<5). It has never been seen in any prior implementation with a similar configuration.

Random permanent lockups (onset in minutes - hours). Keyboard + all mouse buttons dead, mouse pointer moves. Restore by hard reset only (Power button).

Identical issue on two machines. Lenovo ThinkPads: P15v (i7 8c/16t integrated UHD graphics, new) & X390 (i7 4c/8t integrated UHD graphics, 1 year in service)

General configuration: X/Xorg default-configured, libinput, window managers (FVWM|Openbox|i3), browsers (Google-Chrome-Stable|Firefox); no display manager, nor any KDE or Xfce. (The char '|' indicates alternatives that do not matter.)

General remarks: Each of these machines also carries recent Fedora 33 (KV 5.9.16) & Ubuntu 20.04.1 (KV 5.8) Gnome installations; no lockups occur.

Using either machine to ssh as root from a console into the other, seized machine shows that everything is actually running normally except the X server. kill(all) -9 on X, startx, xinit, and .xinitrc has no effect. The logs in /var/log do not show any suspicious entries.

Some old postings suggest that intel_idle.max_cstate=1 on the kernel command line has ameliorated the problem. It is being tested, but really should not be needed.
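
For anyone testing the same workaround, here is a minimal sketch for confirming that the parameter actually reached the running kernel. The helper name has_kernel_param is made up for illustration; it only reads /proc/cmdline (or a file given as the second argument) and changes nothing.

```shell
#!/bin/sh
# Sketch: report whether a kernel parameter is present on the command line.
# has_kernel_param is a hypothetical helper name, not a standard tool.
has_kernel_param() {
    # $1 = parameter name, e.g. intel_idle.max_cstate
    # $2 = file holding the command line (defaults to /proc/cmdline)
    param=$1
    file=${2:-/proc/cmdline}
    # Print every word that is exactly the parameter or starts with "param=".
    # Exit status is 0 if at least one word matched, 1 otherwise.
    awk -v p="$param" '
        { for (i = 1; i <= NF; i++)
              if ($i == p || index($i, p "=") == 1) { print $i; found = 1 } }
        END { exit !found }
    ' "$file"
}

# Usage:
#   has_kernel_param intel_idle.max_cstate || echo "not on the command line"
#   cat /sys/module/intel_idle/parameters/max_cstate   # value the driver is using
```

If the first command prints nothing, the bootloader entry (lilo, elilo or grub, depending on the install) probably was not updated or reinstalled after editing.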

Can anyone corroborate this? Comments?

Chuck56 01-09-2021 01:43 PM

This morning I had an Xorg lockup after suspending overnight with a fully installed and updated 64-current using the amdgpu driver. I don't think this is isolated to the Intel, AMD or nVidia drivers.

GazL 01-09-2021 02:04 PM

Yes, you're not alone.

I've seen the exact same issue a couple of times on an HP laptop with an i3-5157U and integrated Iris 6100 GPU, running KV > 5.10. Others have posted about it too.

Last time it happened to me I was just scrolling down a webpage in firefox using the scroll wheel on my mouse, so unlike chuck above, suspend wasn't a factor.

However, I've not had another hang for a couple of days now, so I don't know what to think at present. Could just be on a lucky-streak. :confused:

USUARIONUEVO 01-09-2021 03:10 PM

Freezes are totally random.

Hopefully the new 5.10.6, which is supposed to be patched, appears in the changelog today.

Gordie 01-09-2021 03:52 PM

Kernel 5.10.5 both generic and huge.
Google Chrome Stable
YouTube videos randomly die on the vine. Have to force a reboot.

Kernel 5.4.84 runs all day without a hiccup

slac 01-09-2021 03:56 PM

Kernel 5.10.6 is out already.

Code:

a/kernel-firmware-20210109_d528862-noarch-1.txz:  Upgraded.
a/kernel-generic-5.10.6-x86_64-1.txz:  Upgraded.
a/kernel-huge-5.10.6-x86_64-1.txz:  Upgraded.
a/kernel-modules-5.10.6-x86_64-1.txz:  Upgraded.

d/kernel-headers-5.10.6-x86-1.txz:  Upgraded.
k/kernel-source-5.10.6-noarch-1.txz:  Upgraded.
  Looks like the patch landed, so we'll give this another try.
  SND_HDA_INTEL_HDMI_SILENT_STREAM n -> y


mario 01-09-2021 07:04 PM

The OP describes what I'm experiencing to the letter, and it looks like there is more than one issue at play here, which is why we have a few threads about it.

Some people's configurations use snd-hda-intel-hdmi, and it can be seen in their xorg stack and sysrq dmesg output.
One thing is common in that case: their systems are frozen completely and SSH also stops working.

For those that are able to SSH into the machine, and see xorg eating up the cpu, check these:

cat /proc/<xorg-pid>/stack

[<0>] rcu_barrier+0x16a/0x1f0
[<0>] i915_gem_object_unbind+0x252/0x360 [i915]
[<0>] i915_gem_set_caching_ioctl+0x149/0x190 [i915]
[<0>] drm_ioctl_kernel+0xaa/0xf0 [drm]
[<0>] drm_ioctl+0x20f/0x3a0 [drm]
[<0>] __x64_sys_ioctl+0x83/0xb0
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

echo w > /proc/sysrq-trigger ; dmesg

[61162.416709] task:kworker/7:1H state:D stack: 0 pid: 185 ppid: 2 flags:0x00004000
[61162.416744] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[61162.416745] Call Trace:
[61162.416748] __schedule+0x207/0x810
[61162.416750] schedule+0x46/0xb0
[61162.416751] schedule_preempt_disabled+0xa/0x10
[61162.416752] __ww_mutex_lock.constprop.0+0x2f9/0x760
[61162.416754] ? sched_clock_local+0x60/0x80
[61162.416778] intel_unpin_fb_vma+0x25/0xa0 [i915]
[61162.416786] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[61162.416809] intel_atomic_cleanup_work+0x67/0x110 [i915]
[61162.416812] process_one_work+0x1d4/0x370
[61162.416813] worker_thread+0x4d/0x3d0
[61162.416814] ? rescuer_thread+0x3d0/0x3d0
[61162.416815] kthread+0x11b/0x140
[61162.416816] ? __kthread_bind_mask+0x60/0x60
[61162.416818] ret_from_fork+0x22/0x30

There is clearly no mention of that hdmi module, but the hang still happens!
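
The two checks above can be wrapped in a small script so they are quick to run over SSH while X is hung. This is only a sketch: the function names and the PROC_ROOT variable are invented for illustration, it assumes the X server process is named X or Xorg, and it needs root on a real system.

```shell
#!/bin/sh
# Sketch: capture the two diagnostics over SSH while X is hung.
# PROC_ROOT and the function names are illustrative, not standard.
PROC_ROOT=${PROC_ROOT:-/proc}

# Print the kernel stack of one process (reading it needs root).
dump_kernel_stack() {
    cat "$PROC_ROOT/$1/stack"
}

# Ask the kernel to log all blocked (state D) tasks, then show the log tail.
# $1 = number of dmesg lines to show (defaults to 60).
dump_blocked_tasks() {
    echo w > "$PROC_ROOT/sysrq-trigger" && dmesg | tail -n "${1:-60}"
}

# Usage on the hung machine, as root:
#   for pid in $(pgrep -x 'X|Xorg'); do dump_kernel_stack "$pid"; done
#   dump_blocked_tasks 80
```

Redirecting the output to a file before the hard reset keeps the traces available for a bug report.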

When I first encountered this issue, I tried appending idle=poll to the kernel command line, and although the issue went away, this is no way to use a laptop.
The only conclusion I can draw is that it's somehow power-saving related. The i915 module does have some options for adjusting that, but it sucks having to wait 2 days for it to freeze up, and it has the worst timing :)

Anyway, my current fix (lazy one at that) is to uninstall xf86-video-intel package, which makes xorg use the modesetting driver, and honestly, for now I can't find any real problems with that.
Since I have two external monitors, they had to be reconfigured, xfce basically forgot about them - only a minor annoyance.
VLC still uses intel hardware decoding, so that's good.
The only downside so far is that after a while my Logitech mouse started scrolling really ... really slowly, but the fix apparently is to re-plug it, and it's good for another day.
Hopefully that's it for now, and if I don't find any more issues down the road, I will probably keep xf86-video-intel uninstalled for good.
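
An alternative to removing the package is to tell Xorg explicitly which driver to load. The following is a sketch under assumptions: the file name 20-modesetting.conf and the Identifier string are arbitrary choices, and the default directory /etc/X11/xorg.conf.d is the usual location but may differ on a given setup.

```shell
#!/bin/sh
# Sketch: pin Xorg to the modesetting driver without removing any package.
# The file name 20-modesetting.conf and the Identifier are arbitrary choices.
write_modesetting_conf() {
    # $1 = target directory (defaults to /etc/X11/xorg.conf.d)
    dir=${1:-/etc/X11/xorg.conf.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/20-modesetting.conf" <<'EOF'
Section "Device"
    Identifier "Intel Graphics"
    Driver     "modesetting"
EndSection
EOF
}

# Usage, as root, followed by restarting X:
#   write_modesetting_conf
```

Deleting the file and restarting X should bring the previous driver selection back, which makes it easy to compare the two.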

Hope this helps someone, because it has been driving me nuts.

walecha 01-09-2021 08:31 PM

Quote:

Originally Posted by mario (Post 6205874)
Anyway, my current fix (lazy one at that) is to uninstall xf86-video-intel package, which makes xorg use the modesetting driver, and honestly, for now I can't find any real problems with that.

I'm also using modesetting (disabling xf86-video-intel) for this multigpu laptop (Intel hd4600 (GT2) + FirePro M6100) and no hangs from 5.4.x until now.

derekn13 01-09-2021 10:44 PM

The modesetting driver has been working well for me, too. No freezes with 5.10.3 or 5.10.5 (haven't tried 5.10.6 yet).

That doesn't really explain what happened -- the xf86-video-intel driver was working fine until the 5.10.x kernels. But I haven't seen any downside to the modesetting driver, so I'm not going to complain.
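
To double-check which driver X actually loaded, the Xorg log can be inspected. A rough sketch follows; it assumes the usual log prefixes "modeset(0)" and "intel(0)", and the default path /var/log/Xorg.0.log is also an assumption (X started without root typically logs under ~/.local/share/xorg/ instead). The function name active_x_driver is made up.

```shell
#!/bin/sh
# Sketch: guess the active video driver from an Xorg log file.
# active_x_driver is a hypothetical helper; the log path is an assumption.
active_x_driver() {
    log=${1:-/var/log/Xorg.0.log}
    if grep -q ' modeset(0):' "$log"; then
        echo modesetting
    elif grep -q ' intel(0):' "$log"; then
        echo intel
    else
        echo unknown
    fi
}

# Usage:
#   active_x_driver
#   active_x_driver "$HOME/.local/share/xorg/Xorg.0.log"
```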

RudieO 01-10-2021 12:57 AM

I just had the problem again with kernel 5.10.5.

I did cat /proc/<xorg-pid>/stack

Before the problem:

[<0>] do_epoll_wait+0x53d/0x5a0
[<0>] __x64_sys_epoll_wait+0x1a/0x20
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

After the problem, logged in over ssh, with Xorg at 92% CPU:

[<0>] rcu_barrier+0x16a/0x1f0
[<0>] i915_gem_object_unbind+0x252/0x360 [i915]
[<0>] i915_gem_set_caching_ioctl+0x149/0x190 [i915]
[<0>] drm_ioctl_kernel+0xaa/0xf0 [drm]
[<0>] drm_ioctl+0x20f/0x3a0 [drm]
[<0>] __x64_sys_ioctl+0x83/0xb0
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Output of echo w > /proc/sysrq-trigger ; dmesg before the problem (last lines):

[ 28.757631] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 28.757821] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3499.315196] sysrq: Show Blocked State
[ 3620.515561] sysrq: Show Blocked State

After the problem:
[ 28.757631] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 28.757821] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3499.315196] sysrq: Show Blocked State
[ 3620.515561] sysrq: Show Blocked State
[ 5051.405328] sysrq: Show Blocked State
[ 5051.405339] task:kworker/3:1H state:D stack: 0 pid: 288 ppid: 2 flags:0x00004000
[ 5051.405397] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[ 5051.405397] Call Trace:
[ 5051.405401] __schedule+0x207/0x810
[ 5051.405402] schedule+0x46/0xb0
[ 5051.405417] schedule_preempt_disabled+0xa/0x10
[ 5051.405418] __ww_mutex_lock.constprop.0+0x2f9/0x760
[ 5051.405419] ? sched_clock_local+0x60/0x80
[ 5051.405440] intel_unpin_fb_vma+0x25/0xa0 [i915]
[ 5051.405448] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[ 5051.405468] intel_atomic_cleanup_work+0x67/0x110 [i915]
[ 5051.405470] process_one_work+0x1d4/0x370
[ 5051.405472] worker_thread+0x4d/0x3d0
[ 5051.405473] ? rescuer_thread+0x3d0/0x3d0
[ 5051.405474] kthread+0x11b/0x140
[ 5051.405474] ? __kthread_bind_mask+0x60/0x60
[ 5051.405476] ret_from_fork+0x1f/0x30
[ 5094.301242] sysrq: Show Blocked State
[ 5094.301252] task:kworker/3:1H state:D stack: 0 pid: 288 ppid: 2 flags:0x00004000
[ 5094.301310] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[ 5094.301310] Call Trace:
[ 5094.301327] __schedule+0x207/0x810
[ 5094.301328] schedule+0x46/0xb0
[ 5094.301330] schedule_preempt_disabled+0xa/0x10
[ 5094.301331] __ww_mutex_lock.constprop.0+0x2f9/0x760
[ 5094.301333] ? sched_clock_local+0x60/0x80
[ 5094.301354] intel_unpin_fb_vma+0x25/0xa0 [i915]
[ 5094.301361] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[ 5094.301381] intel_atomic_cleanup_work+0x67/0x110 [i915]
[ 5094.301383] process_one_work+0x1d4/0x370
[ 5094.301384] worker_thread+0x4d/0x3d0
[ 5094.301385] ? rescuer_thread+0x3d0/0x3d0
[ 5094.301386] kthread+0x11b/0x140
[ 5094.301387] ? __kthread_bind_mask+0x60/0x60
[ 5094.301388] ret_from_fork+0x1f/0x30

I used this setting:
# /etc/modprobe.d/snd_hda.conf
options snd_hda_codec_hdmi enable_silent_stream=N
Login over ssh is still possible and the mouse pointer moves; everything else is frozen.
I have 2 screens on DisplayPort.
Other distributions and Windows run without problems.

Hopefully this helps to solve the problem.
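
A quick way to confirm that the modprobe option took effect after a reboot is to read the module parameter back from sysfs. This is a sketch only: the function name and the SYSFS_ROOT variable are invented for illustration, and the path exists only while snd_hda_codec_hdmi is loaded.

```shell
#!/bin/sh
# Sketch: read back the enable_silent_stream parameter of snd_hda_codec_hdmi.
# SYSFS_ROOT and silent_stream_state are illustrative names.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

silent_stream_state() {
    f="$SYSFS_ROOT/module/snd_hda_codec_hdmi/parameters/enable_silent_stream"
    if [ -r "$f" ]; then
        cat "$f"          # prints Y or N
    else
        echo "module not loaded or parameter missing" >&2
        return 1
    fi
}

# Usage:
#   silent_stream_state
```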

rdsherman 01-10-2021 07:28 PM

Installed KV 5.10.6.

Absolutely no improvement!

No surprise; nothing in the changelog addresses the problem.

Ran for many hours. Last few just on idle (no human input).

Then, abruptly, the fan(s) started to spin with rapidly increasing noise representing a heavy load on the 8 cpus.

Anyone else seen this?
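
When the fans spin up like this and the machine is still reachable over ssh, a short listing of the busiest processes shows whether the X server is the one burning CPU. A minimal sketch, with top_cpu as a made-up helper name:

```shell
#!/bin/sh
# Sketch: list the processes using the most CPU, highest first.
# top_cpu is a hypothetical helper; $1 = number of lines (defaults to 5).
top_cpu() {
    # The trailing "=" on each column suppresses the header line.
    ps -eo pcpu=,pid=,comm= | sort -k1,1 -rn | head -n "${1:-5}"
}

# Usage:
#   top_cpu 10
```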

BrunoLafleur 01-11-2021 04:12 AM

Others seem to be seeing that freeze:
https://gitlab.freedesktop.org/drm/intel/-/issues/2905

That report is for a Debian distribution with the 5.10.4 kernel.

rdsherman 01-12-2021 11:08 AM

Can someone post a link to a KV 5.4.84-88? (kernel-generic + config) Wish I had saved mine; never a problem with it.

Markus Wiesner 01-12-2021 11:46 AM

Quote:

Originally Posted by rdsherman (Post 6206902)
Can someone post a link to a KV 5.4.84-88? (kernel-generic + config)

https://slackware.uk/cumulative/slac...slackware64/a/ (Reference)

rdsherman 01-12-2021 03:38 PM

Quote:

Originally Posted by Markus Wiesner (Post 6206924)

Thank you!


All times are GMT -5.