X windows video freeze - KV 5.10.x
This behavior is observable for the last few weeks since Slackware64-current adopted KV 5.10.x (1<x<5). It has never been seen in any prior implementation with a similar configuration.
Random permanent lockups (onset in minutes to hours). Keyboard and all mouse buttons are dead; the mouse pointer still moves. Recovery is by hard reset only (power button).
Identical issue on two machines, both Lenovo ThinkPads: P15v (i7 8c/16t, integrated UHD graphics, new) and X390 (i7 4c/8t, integrated UHD graphics, 1 year in service).
General configuration: X/Xorg default-configured, libinput, window managers (FVWM|Openbox|i3), browsers (Google-Chrome-Stable|Firefox); no display manager, nor any KDE or Xfce. (The char '|' indicates alternatives that do not matter.)
General remarks: Each of these machines also carries recent Fedora 33 (KV 5.9.16) and Ubuntu 20.04.1 (KV 5.8) Gnome installations; no lockups occur there. Using either machine to ssh as root from a console into the other, seized machine shows that everything is actually running normally except the X server. kill(all) -9 on X, startx, xinit, .xinitrc has no effect. The logs in /var/log do not show any suspicious entries.
Some old postings suggest that intel_idle.max_cstate=1 on the kernel command line has ameliorated the problem. It is being tested, but it really should not be needed. Can anyone corroborate this? Comments? |
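For anyone testing the same workaround, it helps to confirm that the parameter really reached the running kernel. A minimal sketch, assuming a POSIX shell; the helper name has_kparam is made up for illustration and is not part of any tool:

```shell
#!/bin/sh
# has_kparam is a hypothetical helper (illustration only).
# Usage: has_kparam "<kernel command line text>" "<parameter name>"
# Succeeds if the parameter appears as a whole word, with or without "=value".
has_kparam() {
    cmdline=$1
    param=$2
    for word in $cmdline; do
        case $word in
            "$param"|"$param"=*) return 0 ;;
        esac
    done
    return 1
}

# On a live system the booted command line is exposed in /proc/cmdline.
if [ -r /proc/cmdline ]; then
    if has_kparam "$(cat /proc/cmdline)" intel_idle.max_cstate; then
        echo "intel_idle.max_cstate is set on the kernel command line"
    else
        echo "intel_idle.max_cstate is not set"
    fi
fi
```

The same helper works for any other boot parameter, as long as it is passed by name without the value.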
Quote:
|
Yes, you're not alone.
Seen the exact same issue a couple of times on an HP laptop with i3-5157u and integrated Iris 6100 GPU, running KV > 5.10. Others have posted about it too. Last time it happened to me I was just scrolling down a webpage in Firefox using the scroll wheel on my mouse, so unlike chuck above, suspend wasn't a factor. However, I've not had another hang for a couple of days now, so I don't know what to think at present. Could just be on a lucky streak. |
freezes are totally random.
Today we hope the new 5.10.6 appears in the ChangeLog; it is supposed to be patched. |
Kernel 5.10.5 both generic and huge.
Google Chrome Stable YouTube videos randomly die on the vine. Have to force a reboot. Kernel 5.4.84 runs all day without a hiccup |
Kernel 5.10.6 is out already.
Code:
a/kernel-firmware-20210109_d528862-noarch-1.txz: Upgraded. |
The OP describes what I'm experiencing to the letter, and it looks like there is more than one issue at play here, and that's why we have a few threads about it.
Some people's configurations use snd-hda-intel-hdmi, and it can be seen in their Xorg stack and sysrq dmesg output. One thing is common in that case: their systems freeze completely and SSH also stops working. For those who are able to SSH into the machine and see Xorg eating up the CPU, check these:
Code:
cat /proc/<xorg-pid>/stack
[<0>] rcu_barrier+0x16a/0x1f0
[<0>] i915_gem_object_unbind+0x252/0x360 [i915]
[<0>] i915_gem_set_caching_ioctl+0x149/0x190 [i915]
[<0>] drm_ioctl_kernel+0xaa/0xf0 [drm]
[<0>] drm_ioctl+0x20f/0x3a0 [drm]
[<0>] __x64_sys_ioctl+0x83/0xb0
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Code:
echo w > /proc/sysrq-trigger ; dmesg
[61162.416709] task:kworker/7:1H state:D stack: 0 pid: 185 ppid: 2 flags:0x00004000
[61162.416744] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[61162.416745] Call Trace:
[61162.416748] __schedule+0x207/0x810
[61162.416750] schedule+0x46/0xb0
[61162.416751] schedule_preempt_disabled+0xa/0x10
[61162.416752] __ww_mutex_lock.constprop.0+0x2f9/0x760
[61162.416754] ? sched_clock_local+0x60/0x80
[61162.416778] intel_unpin_fb_vma+0x25/0xa0 [i915]
[61162.416786] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[61162.416809] intel_atomic_cleanup_work+0x67/0x110 [i915]
[61162.416812] process_one_work+0x1d4/0x370
[61162.416813] worker_thread+0x4d/0x3d0
[61162.416814] ? rescuer_thread+0x3d0/0x3d0
[61162.416815] kthread+0x11b/0x140
[61162.416816] ? __kthread_bind_mask+0x60/0x60
[61162.416818] ret_from_fork+0x22/0x30
There is clearly no mention of that HDMI module, but the hang still happens! When I first encountered this issue, I tried appending idle=poll to the kernel command line, and although the issue went away, this is no way to use a laptop.
The only conclusion is that it's somehow power-saving related. The i915 driver does have some options for adjusting that, but it sucks having to wait 2 days for it to freeze up, and it has the worst timing :)
Anyway, my current fix (a lazy one at that) is to uninstall the xf86-video-intel package, which makes Xorg use the modesetting driver, and honestly, for now I can't find any real problems with that. Since I have two external monitors, they had to be reconfigured (Xfce basically forgot about them), which is only a minor annoyance. VLC still uses Intel hardware decoding, so that's good. The only other thing is that after a while my Logitech mouse started scrolling really, really slowly, but the fix apparently is to re-plug it, and it's good for another day.
Hopefully that's it for now, and if I don't find any more issues down the road, I will probably keep xf86-video-intel uninstalled for good. Hope this helps someone, because it has been driving me nuts. |
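The two checks quoted earlier in this post can be wrapped up for repeated use over ssh. This is only a sketch: the filter name blocked_tasks is made up, and the live commands in the comments assume the X server process is called Xorg (on some setups it is X).

```shell
#!/bin/sh
# blocked_tasks is a hypothetical filter (illustration only).
# Reads dmesg text on stdin and prints only the header lines of tasks
# reported in uninterruptible sleep (state:D) by "sysrq: Show Blocked State".
blocked_tasks() {
    grep -E 'task:.* state:D( |$)'
}

# Live use, as root, from an ssh session into the frozen machine:
#   cat "/proc/$(pgrep -x Xorg | head -n 1)/stack"   # kernel stack of the X server
#   echo w > /proc/sysrq-trigger                      # dump blocked tasks to the kernel log
#   dmesg | blocked_tasks                             # list which tasks are stuck
```

If the filter prints a kworker line and the full dmesg shows intel_atomic_cleanup_work below it, the hang matches the trace shown in this post.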
Quote:
|
The modesetting driver has been working well for me, too. No freezes with 5.10.3 or 5.10.5 (haven't tried 5.10.6 yet).
That doesn't really explain what happened -- the xf86-video-intel driver was working fine until the 5.10.x kernels. But I haven't seen any downside to the modesetting driver, so I'm not going to complain. |
I just had the problem again with kernel 5.10.5.
I ran cat /proc/<xorg-pid>/stack before the problem:
Code:
[<0>] do_epoll_wait+0x53d/0x5a0
[<0>] __x64_sys_epoll_wait+0x1a/0x20
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
After the problem, logged in via ssh, with Xorg at 92% CPU:
Code:
[<0>] rcu_barrier+0x16a/0x1f0
[<0>] i915_gem_object_unbind+0x252/0x360 [i915]
[<0>] i915_gem_set_caching_ioctl+0x149/0x190 [i915]
[<0>] drm_ioctl_kernel+0xaa/0xf0 [drm]
[<0>] drm_ioctl+0x20f/0x3a0 [drm]
[<0>] __x64_sys_ioctl+0x83/0xb0
[<0>] do_syscall_64+0x33/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
echo w > /proc/sysrq-trigger ; dmesg, before (last lines):
Code:
[ 28.757631] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 28.757821] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3499.315196] sysrq: Show Blocked State
[ 3620.515561] sysrq: Show Blocked State
After:
Code:
[ 28.757631] e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 28.757821] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3499.315196] sysrq: Show Blocked State
[ 3620.515561] sysrq: Show Blocked State
[ 5051.405328] sysrq: Show Blocked State
[ 5051.405339] task:kworker/3:1H state:D stack: 0 pid: 288 ppid: 2 flags:0x00004000
[ 5051.405397] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[ 5051.405397] Call Trace:
[ 5051.405401] __schedule+0x207/0x810
[ 5051.405402] schedule+0x46/0xb0
[ 5051.405417] schedule_preempt_disabled+0xa/0x10
[ 5051.405418] __ww_mutex_lock.constprop.0+0x2f9/0x760
[ 5051.405419] ? sched_clock_local+0x60/0x80
[ 5051.405440] intel_unpin_fb_vma+0x25/0xa0 [i915]
[ 5051.405448] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[ 5051.405468] intel_atomic_cleanup_work+0x67/0x110 [i915]
[ 5051.405470] process_one_work+0x1d4/0x370
[ 5051.405472] worker_thread+0x4d/0x3d0
[ 5051.405473] ? rescuer_thread+0x3d0/0x3d0
[ 5051.405474] kthread+0x11b/0x140
[ 5051.405474] ? __kthread_bind_mask+0x60/0x60
[ 5051.405476] ret_from_fork+0x1f/0x30
[ 5094.301242] sysrq: Show Blocked State
[ 5094.301252] task:kworker/3:1H state:D stack: 0 pid: 288 ppid: 2 flags:0x00004000
[ 5094.301310] Workqueue: events_highpri intel_atomic_cleanup_work [i915]
[ 5094.301310] Call Trace:
[ 5094.301327] __schedule+0x207/0x810
[ 5094.301328] schedule+0x46/0xb0
[ 5094.301330] schedule_preempt_disabled+0xa/0x10
[ 5094.301331] __ww_mutex_lock.constprop.0+0x2f9/0x760
[ 5094.301333] ? sched_clock_local+0x60/0x80
[ 5094.301354] intel_unpin_fb_vma+0x25/0xa0 [i915]
[ 5094.301361] drm_atomic_helper_cleanup_planes+0x52/0x70 [drm_kms_helper]
[ 5094.301381] intel_atomic_cleanup_work+0x67/0x110 [i915]
[ 5094.301383] process_one_work+0x1d4/0x370
[ 5094.301384] worker_thread+0x4d/0x3d0
[ 5094.301385] ? rescuer_thread+0x3d0/0x3d0
[ 5094.301386] kthread+0x11b/0x140
[ 5094.301387] ? __kthread_bind_mask+0x60/0x60
[ 5094.301388] ret_from_fork+0x1f/0x30
I used this setting:
Code:
# /etc/modprobe.d/snd_hda.conf
options snd_hda_codec_hdmi enable_silent_stream=N
Login via ssh is possible and the mouse pointer moves; everything else is frozen. I have 2 screens on DisplayPort outputs. Other distributions and Windows run without problems. Hopefully this can help to solve the problem.
Quote:
|
Installed KV 5.10.6.
Absolutely no improvement! No surprise; nothing in the changelog addresses the problem. Ran for many hours. Last few just on idle (no human input). Then, abruptly, the fan(s) started to spin with rapidly increasing noise representing a heavy load on the 8 cpus. Anyone else seen this? |
Others seem to see that freeze:
https://gitlab.freedesktop.org/drm/intel/-/issues/2905 That report is for a Debian distribution with the 5.10.4 kernel. |
Can someone post a link to a KV 5.4.84-88? (kernel-generic + config) Wish I had saved mine; never a problem with it.
|
Quote:
|
Hello all,
Just to add more info to the thread: besides the lockups running 5.10.X, X <= 7, I'm also experiencing much worse audio quality. Reverting to 5.4.84 solves both issues. |
KV bump -> 5.10.7. NB: i915/drm patch in changelog + libdrm upgrade in SW-current. 24+ hours running & nary a burp | hiccup. (Promising, but, certainly too soon to proclaim victory.) Any other reports from the community?
|
Quote:
upon resuming from suspend-to-ram, had no lock-ups at all on other machines including my laptop. |
I just experienced a full lockup on my desktop requiring the hard reset button. This is a fully up-to-date 64-current with 5.10.7 running KDE5, rebooted yesterday after updates, suspended overnight until I resumed this morning. It worked great for maybe 3 hours until the lockup, while konsole was open with mc running, dolphin was running and focused on ~/Downloads while superposition was downloading, and firefox had 3 tabs open. Before resetting after the lockup I tried to ping and ssh from a terminal on my Chromebook: no joy.
On reboot my /home partition recovered orphaned inodes and my /boot/efi partition needed fsck to reset a dirty bit. Here's info center/system info copied to clipboard:
Code:
Operating System: Slackware 14.2
Code:
Jan 15 06:06:24 slacker kernel: [ 0.005699] ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20200925/tbfadt-615) |
I had similar experiences: hang-ups, freezes, etc. No luck with any of the different 5.10 kernels (tested up to 5.10.7) on my NVIDIA Quadro FX 580 (nouveau driver).
All problems went away when I changed to a 5.4 kernel (5.4.89). No hiccups or any other issues with that KV. |
I saw somewhere a suggestion to remove the xf86-video-intel driver, as it has some problems. On my setup it worked, and there have been no freezes for a few days since the removal.
|
Sad report. 2021 January 15. 18:20 PST. Another lockup / freeze. KV 5.10.7 still ill.
Fedora 33 yesterday added 5.10.6 to its 5.9.16 /boot. I have that on another partition. It will take time to watch. Can anyone else contribute here on the Fedora update? Obviously many hours are needed to observe the malfunction. As others have noted prior kernels seem fine. I have a CentOS Stream 8 using 4.18 & a Ubuntu with 5.8.x series. |
Further relevant corroborating evidence with KV 5.10.6.
https://bugzilla.kernel.org/show_bug.cgi?id=211179 Quote from OP "I am running up-to-date Arch Linux on a system with an Intel I5-9400 processor, 16 GB of ram, Asus mini-ITX motherboard. I do not use a desktop system, just a window manager (bspwm), starting X with startx at the login shell. I have tried using a different window manager and have seen the same problem described below. About once/day, I am experiencing complete X freezes. The system is unresponsive to mouse or keyboard inputs, even attempts to switch consoles. I can ssh into the system from another and attempts to kill the xorg process with kill -9 fail, which suggests a driver problem to me. When I try to reboot, I get messages from systemd that it is waiting for the xorg process to die. Perhaps 5 or 10 minutes later, the system finally reboots." My main sys is similar: xorg + wm + startx on SW-current, without systemd, of course. The ssh results are what I see, too. |
Of relevance: who is running kernel mode setting only, xf86-video-intel only, or both simultaneously? The majority of Linux users seem to do only the first; the second might be an extinct breed; the final scheme...? In the first category, there seem to be few or no problems on the latest kernels. Others have troubles. Comments?
|
Quote:
Slackware-current ships with both drivers available. In that situation, X picks the intel driver. I.e., if you install Slackware-current on a system with Intel graphics, and you haven't messed with things, you're running the intel driver. If you want to use the modesetting driver on Intel graphics hardware, you can either remove the xf86-video-intel package, or set up an Xorg config file. (See posts 2, 14, and 16 in https://www.linuxquestions.org/quest...el-4175688351/.) To answer your question: I've had zero crashes since switching to the modesetting driver with kernel 5.10.3. I'm currently on 5.10.7. |
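For the second route (keeping the package installed but overriding the driver choice), a minimal sketch follows. It is not necessarily what the linked posts contain: the function name, the file name 20-modesetting.conf, and the Identifier string are all made up for illustration; only the Driver line does the work.

```shell
#!/bin/sh
# write_modesetting_conf is a hypothetical helper (illustration only).
# Writes an Xorg config snippet that selects the modesetting driver.
# $1 = target directory (assumed default: /etc/X11/xorg.conf.d)
write_modesetting_conf() {
    dir=${1:-/etc/X11/xorg.conf.d}
    mkdir -p "$dir" || return 1
    # File name and Identifier are arbitrary choices for this sketch.
    cat > "$dir/20-modesetting.conf" <<'EOF'
Section "Device"
    Identifier "Intel Graphics"
    Driver     "modesetting"
EndSection
EOF
}

# Example (as root, then restart X):
#   write_modesetting_conf
```

Deleting the generated file and restarting X goes back to the default driver selection, which makes this easier to undo than removing the package.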
solved
After removing the xf86-video-intel package, Xorg now runs very well on my Intel PC with Slackware64-current and kernel 5.10.x. Already at the sddm log-on screen it detects that two displays with different resolutions are in use and shows the screen correctly at each resolution. It looks like the problem is solved by removing xf86-video-intel-x86_64-1.txz.
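To confirm that X really switched drivers after the removal, the X server log can be checked. A rough sketch, assuming the usual log prefixes "intel(0)" and "modeset(0)" for the first screen; the function name xorg_driver is made up, and the log location varies (commonly /var/log/Xorg.0.log, or ~/.local/share/xorg/Xorg.0.log when X runs without root rights).

```shell
#!/bin/sh
# xorg_driver is a hypothetical helper (illustration only).
# Reads Xorg log text on stdin and prints "intel", "modesetting", or "unknown",
# based on which driver prefix appears in informational (II) lines for screen 0.
xorg_driver() {
    awk '
        /\(II\) intel\(0\)/   { found = "intel" }
        /\(II\) modeset\(0\)/ { found = "modesetting" }
        END { print (found ? found : "unknown") }
    '
}

# Example (log path is an assumption, see above):
#   xorg_driver < /var/log/Xorg.0.log
```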
Quote:
|
The Slackware leaders should provide some "official" guidance about these video / graphic drivers choices, configuration,... It is only now being clarified on LQ.
|
Quote:
|
Yeah, I think there are at least a few bugs in the 5.10.x kernels. There are several threads here in the Slackware forum with different symptoms (some Intel, some nvidia, some dual monitor, ...).
My guess (just a guess) is that switching to the modesetting driver works around one bug (maybe in the kernel, maybe in the intel driver, who knows). Luckily for me, it fixes the bug I was seeing. But it sounds like there are still some other problems lurking. Someone (sorry, I forget who) pointed out that new kernel versions always have some problems, and they generally get sorted out. |
From today's (2021 Jan 19) ChangeLog.txt, PV has recognized the issue:
Quote:
|
Well I compiled kernel 5.10.9 and tried running it under Slackware 14.2. The X freezes when using Zoom returned. Besides, I'm sometimes experiencing screen artifacts when scrolling in Firefox or Palemoon. This is new.
Another piece of bad news is that my bluetooth problems also returned with 5.10.x. My bluetooth earphone used to stop working after a while with timeout error messages in dmesg. This happened under 4.19. Only a reboot or suspend/wakeup would allow me to reconnect. It finally got fixed in 5.4. I did not try any other kernels until 5.10 came out, but somewhere along the way it must have got broken again. This time even suspend/wakeup isn't enough; I need a reboot to connect the device. It appears that when Slackware 15.0 comes with kernel 5.10 I will have to downgrade to 5.4, unless some big fixes are made in the kernel. |
Quote:
Since the modesetting driver does not support a TearFree option, you can try using a compositor to reduce/eliminate the tearing. From what I've read Compton works fairly well, but I haven't tested it myself. You can use this video to check for tearing: (don't click if you have epilepsy or something like that) http://sobukus.de/gpd/displayglitch/...test_60fps.mp4 |
Quote:
/etc/X11/xorg.conf.d/20-intel.conf with the following content: Code:
Section "Device" |
@rrj, @kgha:
Thanks for the info. I'm not running -current, it's 14.2 with my custom kernel (whose options were derived from Slackware's official 4.19 once in -current). I have an intel chipset but I have uninstalled xf86-video-intel so I'm on the drm driver. And there are no issues with 5.4.x, so this is not related to the driver nor its options. It is a kernel problem I think. |
Oh well, my other computer (a netbook with Intel graphics, running Slackware 14.2 with a custom kernel) experiences random shutdowns with 5.10.x, usually within 5 minutes of booting. No such thing occurs under kernel 5.4. There is no lock or freeze; it is as if I had pressed the power button. This is probably not related to the X issue, but it is yet another reason for me to stick with 5.4.
|
I'm having the same issue on an Intel NUC with an Intel Iris Plus Graphics 655 GPU and the lockups just come randomly. Can be as soon as a few minutes after boot or in the middle of the day.
I've tried the following to no avail:
1. appending i915.enable_psr=0 to kernel boot line
2. switching from intel to modesetting driver
3. upgrading to kernel 5.10.9
In the end I downloaded the latest LTS 5.4.92 and compiled it. So far I've had a good 3 hours with no problem and will be leaving my machine on for the week to see how it goes. |
Hi Ilgar,
Would you please specify the processor and chipset and graphics 'card' in your netbook? Quote:
Lately, I read on Linux Questions that the xf86-video-intel driver is the suspected cause of a great many problems. Also, I see in the ChangeLogs for '-current', "Only use the Intel DDX with pre-gen4 hardware. Newer hardware will use the modesetting driver." I see the GMA950 is considered to be third generation. However, I am thinking to get the netbook out, remove the xf86-video-intel driver package, and use the netbook for a long time in order to see if the random(?) lockup problem will occur. |
Quote:
My netbook is an Asus TP200SA with an Intel Celeron N3050 processor. The bug you describe reminds me of the dreaded kernel bug #109051 which affected my previous netbook with an Atom Z3736F processor (but I do not see N270 being mentioned there). I have some good news for you: Just recently someone brought me an old laptop with an Atom N280 processor and asked if I could bring it into a useful shape for someone who needs a computer. I installed Q4OS 3.13 on it (based on Debian Buster, it has kernel 4.19). There were no problems, I even managed to run Zoom on it :). So I am optimistic about N270 also working well under kernel 4.19. |
FWIW, I've had no more graphics lockups post 5.10.4, I've stuck with the intel video driver, as I prefer it to the now default modesetting driver because of its 'tearfree' feature.
|
Spoke too soon. :(
Just had another lockup with 5.10.12 this time. Maybe I'm gonna have to swap to the modesetting driver after all and just use compositing to address tearing. It's not my preferred option, but needs must. How's modesetting been working out for people? |
2 Attachment(s)
it seems that the OpenGL compositor (either default 2.0 and/or 3.1) is not getting along with the new kernel 5.10.X.
The newest kernel 5.10.12 should be able to disable it without reaching a total system-freeze (i see my slackware systems, laptop and desktop, stalling for a few seconds and then continuing with resulting loss of all desktop effects). Switching to XRender is the other option even though I found it a little lacking in video performance, especially if you use heavy video editing resources, playing movies, etc. The surprising thing, in my opinion, is that the kernel developers have not been able to fix this annoying bug since it was first reported by me before xmas/20 with kernel 5.10.4, or maybe not given it as much attention as it deserves as it seems to affect not only nVidia video but AMD cards, native Intel graphics, single and double monitors, as reported in several linux forums. The two screenshots attached happened with about 3-4hrs difference from one-another. I am still hoping that kernel 5.11.X will fix the driver bug causing it. |
Quote:
|
Thanks rd.
I've now switched to kms/modesetting driver, with compton. I hit a problem with tiled wallpapers loaded via wmsetbg not displaying correctly under compton, but fullscreen wallpapers seem to display ok, so I can workaround that. I'd had a good couple of weeks since my last lockup using the 'intel' driver so it'll take me a while before I'll be confident there's no problem with modesetting. I will miss being able to do silly stuff like /usr/libexec/xscreensaver/glschool -root to have animated root windows when I'm feeling in the mood for it, but I can probably do something with fvwm's StayOnBottom style to simulate that. Anyway, I'll report back if I have any further issues with the kms/modesetting driver. |
Quote:
There are scattered reports in this thread that the Intel video driver will function reliably IF you have one or more kernel CL options invoked. Since the lockups / freezes are random, it could take some time to verify their validity. Ultimately, I believe that Intel would need to actively address these limitations or explicitly tell users about the inherent restrictions. |
Quote:
Intel Broadwell-U i3-5157u laptop with Iris 6100 Integrated graphics (rev 09) No external monitor, just the laptops panel. Slackware64-current and CRUX 3.6 dual boot. Kernels 5.5.0 - 5.9.16 unaffected. fvwm 2.6.9 window manager, no compositor. Use both startx (crux) and xdm (slackware). Was using xf86-video-intel (tearfree on). Had the issue on both crux and slackware, always when scrolling in firefox, but to be fair, that may just be coincidence because firefox is my most used app outside of emacs and anything in an xterm. Last time it happened I was scrolling linuxquestions.org, so it's not exactly a graphically intensive page. It's definitely just the gpu/display hanging as I have found that I can: magic-sysrq r and then blindly: ctrl-alt-f1, ctrl-alt-delete, which triggers init to "shutdown -r". |
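The keyboard rescue above only works if the running kernel permits that SysRq function from the keyboard. A small sketch for checking the mask in /proc/sys/kernel/sysrq, based on the kernel's sysrq documentation (0 = disabled, 1 = everything enabled, larger values are a bitmask in which 4 covers keyboard control such as "r"/unraw); the function name sysrq_allows is made up for illustration.

```shell
#!/bin/sh
# sysrq_allows is a hypothetical helper (illustration only).
# Usage: sysrq_allows <mask from /proc/sys/kernel/sysrq> <bit value of the function group>
# Succeeds if the keyboard SysRq mask permits that function group.
sysrq_allows() {
    mask=$1
    bit=$2
    # A mask of exactly 1 means "all functions enabled".
    [ "$mask" -eq 1 ] && return 0
    # Otherwise the mask is a bitfield; 0 leaves every bit clear.
    [ $(( mask & bit )) -ne 0 ]
}

# Example: is "unraw" (magic-sysrq r, group bit 4) allowed from the keyboard?
if [ -r /proc/sys/kernel/sysrq ]; then
    if sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 4; then
        echo "SysRq keyboard control (r) is enabled"
    else
        echo "SysRq keyboard control (r) is disabled"
    fi
fi
```

The mask applies only to keyboard invocation; writing to /proc/sysrq-trigger as root, as done earlier in the thread, is not restricted by it.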