[SOLVED] Random freezes on laptop running Debian testing and Cinnamon
Hello everybody, first post on this forum.
I'm experiencing seemingly random freezes on my laptop running Debian testing with the Cinnamon desktop. I previously ran Debian testing on this same laptop without problems; the freezes only started after a disk recovery with Clonezilla and persisted through a reinstallation of Debian. The problem is present with both the stable and testing versions of the distro. During a freeze the desktop goes first and becomes unresponsive, though sound keeps playing and the mouse pointer keeps moving. After a few seconds either the screen goes black and the sound stops, or the mouse pointer just freezes and the screen doesn't go black.
After the reinstallation it seemed like using a browser (both Firefox and Chromium) would trigger a freeze, but later on starting the package manager would also trigger it. I tried uninstalling the xserver-xorg-video-intel package, which seemed to work at first, but right now I'm experiencing a freeze every time I try to use Skype for Linux. All in all, I can't seem to pinpoint any one application that triggers the freeze. Right now I'm typing this in a Firefox window, and both Firefox and Chromium seem to work normally.
The laptop is quite new and passed memtest86 a couple of months ago. I haven't run it again since the freezes started, though.
I've tried to skim the logs, but don't really know what I'm looking for. Here's the inxi output FWIW:
It sounds to me like you restored or cloned bad bytes from a failing disk, which means some of the programs are literally missing bits of themselves.
I'd reinstall all the programs, using apt-get install --reinstall [the entire package set].
There's no real way to know. If the failing disk was the source of the clone, you cloned the missing binary bits along with the good ones; if the clone disk itself has issues, it could have dropped bits. Either way, it sounds to me like you have some corrupted binaries. That's my first guess.
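As a rough sketch of the reinstall suggestion above: the helper below only builds the apt-get command from a package list so it can be reviewed before running. The function name build_reinstall_cmd is made up for illustration, and the debsums check in the comments assumes the debsums package is installed, which the thread does not mention.

```shell
# Hypothetical helper (not part of apt): read package names, one per line,
# on stdin and print the matching reinstall command instead of running it.
build_reinstall_cmd() {
    # Drop blank lines, remove duplicates, join the names with spaces.
    pkgs=$(grep -v '^[[:space:]]*$' | sort -u | tr '\n' ' ')
    pkgs=${pkgs% }
    [ -n "$pkgs" ] && printf 'apt-get install --reinstall %s\n' "$pkgs"
}

# On the affected machine, review the printed command, then run it as root:
#   dpkg-query -W -f='${binary:Package}\n' | build_reinstall_cmd
# Assumption: debsums is installed. It lists files whose checksum no longer
# matches the package, which narrows down what is actually corrupted:
#   debsums -c
```

Printing the command rather than executing it is a deliberate choice here: reinstalling every package is slow, so it may be worth trimming the list to whatever a checksum check flags first.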
Since you've installed several times and the problem persists, I would run memtest again, although memtest sometimes fails to find the error. Memtester has always worked for me. You might also want to reseat the RAM and see what happens. And you could try a live system, like Knoppix, and see whether the freezes occur there.
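For anyone wanting to try memtester from the running system, here is a minimal sketch. The helper name memtester_size_mb and the 80% figure are assumptions chosen for illustration; memtester itself takes a size and an optional loop count.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# about 80% of MemAvailable in whole megabytes. The 80% share is an
# assumption meant to leave some room for the running system.
memtester_size_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", int($2 * 8 / 10 / 1024) }'
}

# Run as root so memtester can lock the memory; 3 is the number of loops:
#   memtester "$(memtester_size_mb < /proc/meminfo)M" 3
```

Unlike memtest86, this cannot test memory already in use by the kernel and other programs, so it complements a boot-time test rather than replacing it.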
Also, newer hardware can be flaky with Linux. It takes a little while to get the drivers right. But it sounds like a memory error.
The Gigabyte Kaby Lake motherboard with AMI BIOS and UHD Graphics 630 that I bought in December locked up out of the blue when it was 5 months old. I couldn't get it to boot far enough to run memtest86 until I removed a RAM stick from one of the slots. I'm still waiting on Gigabyte to return it from RMA "repair" to make all 4 slots functional again. Does your laptop have 2 installed RAM sticks? If so, try using only one, and test both.
I'll try reseating the ram and running memtest on both sticks. Forgot to mention in the opening post, though, I'm dual booting Win10 in which I haven't experienced the freezes (haven't really used Win10 for anything other than installing it and configuring it, so hard to say).
Since posting, Skype runs normally again and I currently can't reproduce the freezes, but I'll get back after the memtest.
Reseated the RAM (just one stick), ran memtest86 for 4 passes without incident. Problem still persists, but doesn't seem to have anything to do with any specific program.
I suspect it is mere coincidence that it started after the Clonezilla operation. More likely the newness is related: a defect that took time for heat to expose.
Newness also may mean there's a recommended BIOS update available.
I wouldn't be satisfied with only 4 passes of Memtest86. 10-12 hours (overnight) is what I'd do.
Overnight run of memtest86 produced no errors, and there's no UEFI update available (at least yet). Now if kernel modules could be to blame, is there some way to zero in on the culprit? Unloading modules one by one and waiting for symptoms would probably not be practical as I can go for days (or weeks, even) without freezes and then one day I'll have five.
Thanks for all the replies so far, the help is greatly appreciated!
Can you figure out what's freezing exactly? Is it just X or the whole system? E.g.: Can you CTRL-ALT-F? switch to a tty and restart the X Server?
Can you still ssh into the box from another computer?
If it is only Xorg that's crashing I'd check the Xorg logs for pointers.
If it is the whole system that's crashing, I'd look in the journal / dmesg...
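A small sketch of what "looking in the logs" could mean in practice. The keyword list and the function name suspicious_lines are assumptions for illustration, not a complete set of things to look for.

```shell
# Hypothetical filter: keep only log lines that often accompany a GPU or
# kernel hang. The keyword list is an assumption and is not exhaustive.
suspicious_lines() {
    grep -i -E 'i915|gpu hang|oops|bug:|call trace|segfault|hung task'
}

# Kernel messages from the previous boot (needs a persistent journal):
#   journalctl -k -b -1 | suspicious_lines
# Xorg log of the previous session, assuming Xorg logs to this path:
#   suspicious_lines < /var/log/Xorg.0.log.old
```

After a hard reset the interesting lines are in the previous boot's log, not the current one, which is why the examples read the older logs.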
Have you done a hard drive check with smartmontools?
On a separate note: When you run into these freezes, how are you recovering? I once had freezes for a period of time (these were fixed after using a newer kernel) and hard reset the computer every time, which created a number of bad sectors on the disk. I wish I had known about https://en.wikipedia.org/wiki/Magic_SysRq_key back then...
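To make the Magic SysRq tip concrete, here is a hedged sketch that checks whether the current kernel.sysrq setting permits the sync, remount and reboot keys. The function name sysrq_can_safe_reboot is made up; the bit values are the ones listed in the kernel's sysrq documentation.

```shell
# Hypothetical helper: given a kernel.sysrq value, succeed if the keys
# needed for a gentler emergency reboot are enabled.
# Bits: 16 = sync, 32 = remount read-only, 128 = reboot/poweroff.
# A value of exactly 1 enables every SysRq function.
sysrq_can_safe_reboot() {
    mask=$1
    [ "$mask" -eq 1 ] && return 0
    need=$((16 + 32 + 128))
    [ $((mask & need)) -eq "$need" ]
}

# Check the running system:
#   sysrq_can_safe_reboot "$(cat /proc/sys/kernel/sysrq)" && echo "ok"
# Enable all SysRq functions until the next reboot (as root):
#   sysctl kernel.sysrq=1
# During a freeze: Alt+SysRq+S (sync), then U (remount read-only), then B (reboot).
```

Whether the key combination is honoured during this particular freeze is not known from the thread; it depends on the kernel still handling keyboard input.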
Quote:
Originally Posted by joe_2000
Can you figure out what's freezing exactly? Is it just X or the whole system? E.g.: Can you CTRL-ALT-F? switch to a tty and restart the X Server?
Can you still ssh into the box from another computer?
If it is only Xorg that's crashing I'd check the Xorg logs for pointers.
If it is the whole system that's crashing, I'd look in the journal / dmesg...
I can't believe I didn't think of this earlier (newb). CTRL+ALT+DEL is ineffective, though, as is ACPI shutdown by pressing the power button. Sometimes the freeze is preceded by a couple of graphical glitches, but not always. I'll have to get back to this when I encounter the next freeze.
Quote:
Originally Posted by joe_2000
Have you done a hard drive check with smartmontools?
Disk seems to be fine.
Quote:
Originally Posted by joe_2000
On a separate note: When you run into these freezes, how are you recovering? I once had freezes for a period of time (these were fixed after using a newer kernel) and hard reset the computer every time, which created a number of bad sectors on the disk. I wish I had known about https://en.wikipedia.org/wiki/Magic_SysRq_key back then...
Thanks for the tip. I've been doing hard resets so far, and luckily I apparently haven't damaged the disk yet.
Quote:
Originally Posted by hobocore
Sometimes the freeze is preceded by a couple of graphical glitches but not always.
This all sounds a bit like it might be related to your graphics card. Some ideas:
Search system logs for output related to your graphics card.
Run lspci to get the exact identifier and search the web for similar issues with this graphics card.
Check if you have the right graphics drivers installed.
It might also be worth searching for kernel parameters that could help tweak the driver's behavior...
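The first two ideas in the list above can be sketched roughly like this. The helper name gpu_block is an assumption for illustration; it only trims lspci output down to the graphics device so the exact identifier and the driver in use are easy to copy into a web search.

```shell
# Hypothetical helper: from `lspci -nnk` output on stdin, print only the
# block describing the graphics device, including its indented detail
# lines such as "Kernel driver in use".
gpu_block() {
    awk '/^[^ \t]/ { show = /VGA compatible controller|3D controller|Display controller/ } show'
}

# Exact identifier and driver of the graphics card:
#   lspci -nnk | gpu_block
# Kernel log lines mentioning the driver named there, for example i915:
#   dmesg | grep -i i915
```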
Quote:
Originally Posted by hobocore
I've previously run Debian testing on this same laptop without problems and they only started after a disk recovery
Are you absolutely positive that the problem did not exist before the disk recovery? I.e. have you run Debian long enough to be confident that it did not have this problem?
What I am getting at: You might want to try a different distro altogether or at least a newer kernel. If you are not comfortable installing a custom kernel consider running Debian unstable or Arch for a while just to see what happens...
Couple of freezes again today, and during the freeze I can NOT switch to a tty by Ctrl+Alt+F(n), but I CAN still ssh into the laptop. I switched to a root shell with su and tried to restart lightdm, but that did nothing. Searching for similar issues with the Intel UHD 620 Kaby Lake and the i915 driver, I did find a blog post describing eerily similar behaviour, but the poster in question could NOT ping the box after a freeze. There are some bug reports about this graphics card and new kernels, so it might be that. Couldn't find a definitive fix, though.
dmesg did show these messages at boot:
Quote:
[ 18.109653] i915 0000:00:02.0: firmware: failed to load i915/kbl_dmc_ver1_04.bin (-2)
[ 18.109661] firmware_class: See https://wiki.debian.org/Firmware for information about missing firmware
[ 18.109666] i915 0000:00:02.0: Direct firmware load for i915/kbl_dmc_ver1_04.bin failed with error -2
[ 18.109671] i915 0000:00:02.0: Failed to load DMC firmware i915/kbl_dmc_ver1_04.bin. Disabling runtime power management.
Installing the firmware-misc-nonfree package fixed that, but the freezes still persist.
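For readers hitting the same messages, a minimal sketch for pulling the missing firmware file names out of dmesg. The function name missing_firmware is made up, and the apt-file lookup in the comments assumes the apt-file package is installed.

```shell
# Hypothetical helper: list firmware files the kernel reported as missing,
# based on the "firmware: failed to load <file>" message format shown above.
missing_firmware() {
    sed -n 's/.*firmware: failed to load \([^ ]*\).*/\1/p' | sort -u
}

# List missing firmware files on the running system:
#   dmesg | missing_firmware
# Assumption: apt-file is installed. Find the package shipping a file:
#   apt-file search kbl_dmc_ver1_04.bin
# After installing the package and rebooting, check what the driver logs now:
#   dmesg | grep -i dmc
```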
Quote:
Originally Posted by hobocore
Couple of freezes again today, and during the freeze I can NOT switch to a tty by Ctrl+Alt+F(n), but I CAN still ssh into the laptop.
That's an interesting fact!
Quote:
Originally Posted by hobocore
su into root shell and tried to restart lightdm but that did nothing.
Hmm, it must have done something? If you cannot gracefully restart lightdm have you tried kill -9 on the lightdm process and then starting it?
Have you done a top to see if there are any processes going crazy on memory / cpu usage?
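Since ssh still works during the freeze, the two checks above could look roughly like this. The helper name hogs and the 50% default threshold are assumptions for illustration; pkill is used here in place of looking up the PID for kill -9 by hand.

```shell
# Hypothetical helper: from `ps -eo pid,pcpu,pmem,comm` output on stdin,
# print the lines of processes whose CPU or memory share (in percent)
# exceeds the given threshold. The header line is skipped.
hogs() {
    limit=${1:-50}
    awk -v limit="$limit" 'NR > 1 && ($2 + 0 > limit || $3 + 0 > limit) { print }'
}

# Over ssh while the desktop is frozen:
#   ps -eo pid,pcpu,pmem,comm | hogs 50
# Force-stop the display manager and start it again (as root):
#   pkill -9 lightdm; systemctl start lightdm
```

A one-off snapshot from ps is used instead of interactive top so the output can be saved and posted to the thread.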
Quote:
Originally Posted by hobocore
Following messages are also shown at boot:
Google-search for those messages yielded very little, is this relevant?
This seems to be something that is there to store information about crashes. Might be worth diving into for further diagnosis...
Quote:
Originally Posted by hobocore
There are some bug reports about this graphics card and new kernels, so might be that. Couldn't find a definitive fix, though.
So it might be worth trying to install an older kernel? Can you find out what kernel version exactly you were running when you did not have these problems?