[SOLVED] Kernel update did something weird to X. drm issue?
I wasn't sure where to post this. I think general software seems the most appropriate.
A little background. The distro is AntiX, recently upgraded to AntiX-17 with no problems. It was still running on the old AntiX-16 kernel (4.4.10) so I installed the latest kernel in the Debian repository (4.9.0), then chose exit/reboot. Immediately I noticed something weird: the video went mad with coloured lights. Then it shut down and rebooted.
Everything on the reboot looked normal initially. There were no error messages and slim (the display manager) came up normally. I logged in and the video went black. And that was that. X was borked, no display, no keyboard.
I rebooted into single-user mode and copied over the Xorg logs for the abnormal shutdown and startup. The tail of the shutdown one is attached. As you can see, there was a segfault in X.
For the new startup, it got as far as registering the mouse normally, then printed "Backtrace:" and went dead. Another segfault, I guess.
As it clearly has something to do with the new kernel, I am tentatively thinking "drm" because that's where the kernel and X intersect. And as the problem only manifests after the graphical login, there must be some transfer of ownership of the hardware at that point.
If anyone can suggest any further tests, please do, but remember that this is a different machine upstairs so I can't carry out tests instantly.
Update: I booted again, got the slim login page, but went to console instead (keyboard still works at this stage). Looked at dmesg, nothing unusual. Stopped slim by using its initscript in /etc/init.d, then tried startx. Screen went black, everything dead as before. Next time I'll try startx as root, see what that does. I'm shutting down for the day.
Last edited by hazel; 01-04-2018 at 02:03 AM.
Reason: Update added
To me this looks more like a glibc issue, or a problem with how your X server was built (most likely a faulty build).
Off topic, just asking because I am a nerd: any reason why it references i386 and not i686 or amd64?
I think the output of these two commands, saved to a file, would help us most in pinpointing the issue.
Quote:
cat /var/log/Xorg.0.log
Quote:
dmesg
The TTY keyboard should always work. That was "init 3" in the old days, and it has nothing to do with your X server issues. You usually see most of what the X server does in /var/log/Xorg.0.log.
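Since the full log is long, here is a minimal sketch for pulling out only the lines that matter. It assumes the usual Xorg log markers, `(EE)` for errors and `(WW)` for warnings; the function name `xorg_problems` is made up for illustration, and the default path is the one mentioned above.

```shell
# Print error (EE) and warning (WW) lines from an Xorg log, with line numbers.
# $1: path to the log (defaults to /var/log/Xorg.0.log)
xorg_problems() {
    grep -n -E '\((EE|WW)\)' "${1:-/var/log/Xorg.0.log}"
}

# Usage: xorg_problems /var/log/Xorg.0.log.old
```

The legend line near the top of the log, which explains the markers, will also match, so expect one harmless hit there.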
I would also consider going back to the 4.4 kernel branch. I have recently had my own share of bad kernels for quite a while. Older long-term kernels are usually more mature and usable.
When I checked kernel.org a few hours ago, 4.9.73 or 4.9.74 was the latest long-term release in that branch. 4.9.0 is very, very old in my view.
Quote:
To me this looks more like a glibc issue, or a problem with how your X server was built (most likely a faulty build).
Off topic, just asking because I am a nerd: any reason why it references i386 and not i686 or amd64?
This is a very old computer. It doesn't do 64-bit.
Quote:
The TTY keyboard should always work. That was "init 3" in the old days, and it has nothing to do with your X server issues. You usually see most of what the X server does in /var/log/Xorg.0.log.
TTY keyboard only works if you can get to a TTY. But once the keyboard has failed in X, you can't go back to a console because you need a working keyboard to do that! I can switch to a console before the failure, but not afterwards.
Quote:
I would also consider going back to the 4.4 kernel branch. I have recently had my own share of bad kernels for quite a while. Older long-term kernels are usually more mature and usable.
Yes, I'll try that. I also want to try getting X up as root under the new kernel. Obviously you can't work like that on a permanent basis, but if it bypasses the X failure, then we have a permissions issue of some kind involving hardware devices. And that would switch the spotlight onto udev. AntiX-16 uses a systemd-free version of udev, AntiX-17 uses eudev. I don't know which one I have right now after my grand update but I definitely need to find out.
There's nothing abnormal about the Xorg logs except the way they end, and I've already described that.
I thought I had installed eudev, but apparently I hadn't. I get absent-minded these days. So I installed it and removed udev and libudev1. Unfortunately that hasn't solved the problem.
I've just tried stopping slim and using startx from root, following my hunch that this was some kind of hardware permissions problem, but that doesn't work either.
The weird thing is that if I use GRUB's advanced options menu to boot my old kernel, X works just fine and I can log in normally.
When it fails at slim login, the failure occurs just after the mouse pointer appears. With startx, I don't even see a mouse pointer.
I'm attaching a complete failed Xorg.0.log to this post. Maybe someone will see something that I can't.
A long shot - install, if you haven't already, xserver-xorg-legacy.
Otherwise I would just keep using the older, working kernel. Maybe the hardware doesn't like anything newer than 4.4.
Well, I finally found out what it was. But I still don't know why it made such a difference.
I had booted from the old kernel and was in synaptic, checking for xserver-xorg-legacy. Turns out I had this installed, so I thought "End of the road! I've just got a rubbish kernel. Let's get rid of it." So I went to the kernel section and noticed that there were three versions of that kernel, and I had installed the non-PAE one. I did a quick check of /proc/cpuinfo and it turns out I do have PAE, so I removed that kernel and installed a PAE version instead. And now everything works.
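For anyone who wants to run the same check, a rough sketch follows. It looks for the `pae` flag in /proc/cpuinfo and checks whether the running kernel is a PAE build. The function names are invented for illustration, and the kernel check assumes Debian-style release names ending in `-pae` (for example `4.9.0-5-686-pae`), which may not hold on other distros.

```shell
# Succeeds if the "flags" line of a cpuinfo-style file lists "pae".
# $1: file to read (defaults to /proc/cpuinfo)
cpu_has_pae() {
    grep -qE '^flags[[:space:]]*:.*[[:space:]]pae([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Succeeds if a kernel release string names a PAE build.
# Assumption: Debian-style naming, where PAE kernels end in "-pae".
# $1: release string (defaults to the running kernel's, from uname -r)
kernel_is_pae() {
    case "${1:-$(uname -r)}" in
        *-pae) return 0 ;;
        *)     return 1 ;;
    esac
}

if cpu_has_pae; then echo "CPU supports PAE"; else echo "no PAE flag found"; fi
if kernel_is_pae; then echo "running a PAE kernel"; else echo "running a non-PAE kernel"; fi
```

If the first line says the CPU supports PAE and the second says the kernel is non-PAE, that is the mismatch described above.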
But why should it matter to X whether the kernel uses pae?
Before suspecting X directly, I would suspect the graphics card driver. That driver is a kernel module. When the driver in use does not handle the graphics card correctly, unexpected reboots can occur after entering graphics mode, for example when the screensaver launches. If your machine is 32-bit, you can expect the graphics card to be 32-bit too.
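To see which kernel module is actually driving the card, something like the sketch below may help. It filters the output of `lspci -k` (from pciutils) down to the "Kernel driver in use" line of each VGA controller. The function name `vga_kernel_driver` is hypothetical, and the filter assumes the usual `lspci -k` layout with indented detail lines under each device.

```shell
# Read `lspci -k` output on stdin and print the kernel driver bound to
# each VGA controller, one per line.
vga_kernel_driver() {
    awk '
        /VGA compatible controller/ { in_vga = 1; next }   # start of a VGA device
        /^[^ \t]/                   { in_vga = 0 }         # any other device header
        in_vga && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use: */, "")
            print
        }
    '
}

# Usage: lspci -k | vga_kernel_driver
```

Comparing the result under the working and the failing kernel would show whether the same driver is being loaded in both cases.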