LinuxQuestions.org

dr.s 08-09-2013 02:03 AM

3.10.5 kernel unstable?
 
1 Attachment(s)
Been running Slackware current on this desktop with different kernels, all very stable. I upgraded to the stock 3.10.5 kernel (Aug. 7) then applied the Aug. 8 updates. The system had a complete crash a few hours ago, out of the blue, while I was reading a webpage (only Firefox, Konqueror and Konsole were loaded).

I think this is the first time I've ever seen a total crash on Linux. I don't know if it's the latest kernel or a combination of other factors; all I could do was take a snapshot of the screen and do a hard reboot.

danielldaniell 08-09-2013 03:19 AM

This happens for me also, on a Dell Inspiron 5110. It would help if your image were a bit clearer, but the last line (drm_kms_helper) suggests that it has something to do with the DRM interface. The same goes for my laptop.

It crashes when playing YouTube videos, and suspend to RAM also got broken: it just flashes the caps-lock LED and blanks the screen.
How ironic, given the Slackware changelog for the update:
"Looks like 3.10.x got LTS status, but more importantly fixes the power issue on resume with some Intel machines.".

dunric 08-09-2013 07:16 AM

I'd suggest temporarily disabling or removing all non-stock kernel modules, like VirtualBox's or the graphics adapter's, and seeing whether that improves stability. If you are running a machine with an integrated Intel graphics adapter using the messy i915 driver, you are probably out of luck: there have been wild changes in the recent past that fixed various longstanding issues but brought new ones. Some of the crap is fixed in 3.11.x, but the diffs can't be applied to 3.10.x, and the devs are currently too busy/lazy/incompetent to backport them :(
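
To see which loaded modules are candidates for removal, one rough approach is to look for the taint markers that /proc/modules appends to out-of-tree or proprietary modules, such as (O) or (P). This is only a sketch: the function name is invented for illustration, and it assumes the taint marker is the last field on the line.

```shell
# Sketch only: list modules that /proc/modules marks with taint flags,
# e.g. (O) for out-of-tree or (P) for proprietary.
# list_tainting_modules is a made-up name; pass a file argument to try it
# on saved output instead of the live system.
list_tainting_modules() {
    modules_file="${1:-/proc/modules}"
    # Assumption: a taint marker, when present, is the trailing field,
    # optionally followed by a + or - load-state sign inside the parentheses.
    awk '$NF ~ /^\([A-Z]+[+-]?\)$/ { print $1, $NF }' "$modules_file"
}
```

Calling `list_tainting_modules` with no argument reads the running system; anything it prints could then be unloaded or kept from loading before testing again.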

danielldaniell 08-09-2013 07:36 AM

1 Attachment(s)
Hey dunric, thanks for the tips. I'll try to remove the vbox drivers, and see if that makes any difference. Also, I've tested 3.11-rc4, but it also panics with the same symptoms.
I've managed to take a picture of the panic, too.

AlleyTrotter 08-09-2013 09:15 AM

Same problem here!
I finally got it fixed by blacklisting module 'mei-me'
Code:

john@linux:~$ cat /etc/modprobe.d/BLACKLIST-mei.conf
# Do not load the kernel mei modules, since they interfere with S3
blacklist mei-me

Can't take credit as I got the recommendation from fearless leader PV
Of course the configuration of the kernel must include
Code:

CONFIG_INTEL_MEI=m
CONFIG_INTEL_MEI_ME=m

built as modules so's we can blacklist them.
Hope that helps
John
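
A quick way to check both conditions above (mei_me built as a module, and a blacklist entry in place) could look like the following. It is a minimal sketch: the function names are hypothetical, and the path to the kernel .config has to be supplied by the caller.

```shell
# Hypothetical helpers; the names are assumptions made for this sketch.

# Succeeds if the given kernel .config builds mei_me as a module.
mei_me_is_modular() {
    grep -q '^CONFIG_INTEL_MEI_ME=m$' "$1"
}

# Succeeds if any *.conf file in the given modprobe.d directory blacklists
# the module. Both spellings are accepted, since modprobe treats '-' and '_'
# in module names as equivalent.
mei_me_is_blacklisted() {
    conf_dir="${1:-/etc/modprobe.d}"
    grep -Eqs '^[[:space:]]*blacklist[[:space:]]+mei[-_]me[[:space:]]*$' "$conf_dir"/*.conf
}
```

For example, `mei_me_is_modular /path/to/.config && mei_me_is_blacklisted && echo ok` prints "ok" only when both checks pass.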

Sorry! I should have mentioned that my problem also included a flood of the lines below in syslog. However, the last lines of my kernel dump and yours are identical.
Code:

Aug  3 09:57:55 linux kernel: [ 1172.798841] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Aug  3 09:57:55 linux kernel: [ 1172.799750] mei_me 0000:00:16.0: reset: wrong host start response
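
To judge how bad such a flood is, the matching lines can simply be counted. This is a sketch under assumptions: the function name is made up, and /var/log/syslog is only a guess at where these messages end up on a given system.

```shell
# Sketch: count "unexpected reset" messages from mei_me in a log file.
# count_mei_resets is an invented name; the default log path is an assumption.
count_mei_resets() {
    log_file="${1:-/var/log/syslog}"
    # grep -c prints the number of matching lines
    grep -c 'mei_me .*unexpected reset' "$log_file"
}
```

Running it before and after blacklisting the module gives a rough idea of whether the flood has stopped.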


danielldaniell 08-09-2013 09:57 AM

The blacklisting doesn't solve the intermittent panics for me, unfortunately.

brobr 08-09-2013 10:18 AM

On my box [Intel i7-3632QM quad core with Intel HD4000 graphics] there are no panic problems, but I ran into this bug when attaching external mass storage (my phone) via a USB cable:

Quote:

[33775.957272] usb 1-1.2: new full-speed USB device number 9 using ehci-pci
[33776.050735] usb 1-1.2: unable to read config index 0 descriptor/start: -32
[33776.050748] usb 1-1.2: chopping to 0 config(s)
[33776.056111] usb 1-1.2: New USB device found, idVendor=0471, idProduct=1201
[33776.056119] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[33776.056132] usb 1-1.2: Product: SAMSUNG Mobile USB Modem
[33776.056133] usb 1-1.2: Manufacturer: Samsung
[33776.056134] usb 1-1.2: SerialNumber: 000000-00-000000
[33776.056301] usb 1-1.2: no configuration chosen from 0 choices
[33778.338009] usb 1-1.2: USB disconnect, device number 9
[33779.537274] usb 1-1.2: new full-speed USB device number 10 using ehci-pci
[33779.631620] usb 1-1.2: New USB device found, idVendor=04e8, idProduct=675a
[33779.631627] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[33779.631631] usb 1-1.2: Product: SAMSUNG Mobile USB Modem
[33779.631634] usb 1-1.2: Manufacturer: Samsung
[33779.631637] usb 1-1.2: SerialNumber: 000000-00-000000
[33779.633531] usb-storage 1-1.2:2.0: USB Mass Storage device detected
[33779.633689] scsi11 : usb-storage 1-1.2:2.0
[33780.635592] scsi 11:0:0:0: Direct-Access SAMSUNG B2100 0 2.00 PQ: 0 ANSI: 0
[33780.640632] scsi 11:0:0:1: Direct-Access SAMSUNG B2100 1 2.00 PQ: 0 ANSI: 0
[33780.643666] sd 11:0:0:0: [sdc] 15685632 512-byte logical blocks: (8.03 GB/7.47 GiB)
[33780.651830] sd 11:0:0:0: [sdc] Test WP failed, assume Write Enabled
[33780.654840] sd 11:0:0:1: [sdd] 16508 512-byte logical blocks: (8.45 MB/8.06 MiB)
[33780.660970] sd 11:0:0:0: [sdc] Asking for cache data failed
[33780.660973] sd 11:0:0:0: [sdc] Assuming drive cache: write through
[33780.666820] sd 11:0:0:1: [sdd] Test WP failed, assume Write Enabled
[33811.448291] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33811.551675] sd 11:0:0:1: [sdd] Asking for cache data failed
[33811.551679] sd 11:0:0:1: [sdd] Assuming drive cache: write through
[33842.454272] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33873.332232] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33904.338228] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33935.280275] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33966.287082] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[33997.292107] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[34028.234144] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
[34028.396849] sd 11:0:0:0: [sdc] Attached SCSI removable disk
[34028.406809] sd 11:0:0:1: [sdd] Attached SCSI removable disk
[34028.419333] sd 11:0:0:0: [sdc] 15685632 512-byte logical blocks: (8.03 GB/7.47 GiB)
[34028.424955] sd 11:0:0:0: [sdc] Test WP failed, assume Write Enabled
[34028.431205] sd 11:0:0:0: [sdc] Asking for cache data failed
[34028.431207] sd 11:0:0:0: [sdc] Assuming drive cache: write through
[34059.176029] usb 1-1.2: reset full-speed USB device number 10 using ehci-pci
......
On a USB3 bus it appeared exactly as in a bug reported for 3.10.3:
http://comments.gmane.org/gmane.linux.usb.general/91405
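
The reset loop in the log above can be summarised per USB port from saved kernel log text. A rough sketch, with an invented function name and the assumption that the messages keep the "usb <port>: reset ... USB device number" wording shown above:

```shell
# Sketch: count "reset ... USB device" messages per USB port in a file of
# kernel log text (for example the saved output of dmesg).
# count_usb_resets is a made-up name for this illustration.
count_usb_resets() {
    # Extract the port (e.g. 1-1.2) from each reset line, then tally.
    sed -n 's/.*usb \([0-9.-]*\): reset .*USB device number.*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a port followed by the number of resets seen for it, so a device stuck in the loop stands out immediately.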

Now I am trying this patch, which did not make it into 3.10.5:
http://marc.info/?l=linux-usb&m=137523956310060&w=2

.....

Ok, recompiled and rebooted..
.....

Yes, with the lines
Quote:

if (sdev->skip_vpd_pages)
        goto fail;

added to <path-to-kernel-source>/drivers/scsi/scsi.c as described in the patch-file, my phone is now recognised.

Ser Olmy 08-09-2013 10:43 AM

Quote:

Originally Posted by danielldaniell (Post 5006264)
The blacklisting doesn't solve the intermittent panics for me, unfortunately.

From the screendump it seems the panic is related to the nouveau driver. I can't say I'm surprised, as several previous versions of that driver have caused hard lock-ups on some of my systems.

Have you tried the proprietary Nvidia drivers? The latest version should compile cleanly against a 3.10.5 kernel. Granted, these drivers have a few issues of their own, but that may not affect your particular card/chip.

danielldaniell 08-09-2013 10:51 AM

This laptop has two graphics cards, a discrete NVIDIA and an integrated Intel. I turn off the NVIDIA card as early as I can during boot (via the /sys/kernel/debug/vgaswitcheroo/switch file) and just use the Intel. I don't want to have anything to do with NVIDIA cards or their proprietary drivers; if I could, I'd take that video card out of the laptop...
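
For reference, switching the unused card off through that file can be sketched roughly as below. The function name is invented for illustration, and the snippet assumes debugfs is mounted and that it runs as root, for example from a local boot script.

```shell
# Sketch: power off the currently unused GPU through vga_switcheroo.
# power_off_inactive_gpu is a made-up name; the real switch file needs root
# and a mounted debugfs. A different path can be passed for a dry run.
power_off_inactive_gpu() {
    switch_file="${1:-/sys/kernel/debug/vgaswitcheroo/switch}"
    if [ ! -w "$switch_file" ]; then
        echo "cannot write to $switch_file" >&2
        return 1
    fi
    # "OFF" asks vga_switcheroo to power down the card that is not in use.
    echo OFF > "$switch_file"
}
```

Reading the same file afterwards (`cat /sys/kernel/debug/vgaswitcheroo/switch`) shows the power state of each card.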

cwizardone 08-09-2013 11:40 AM

Quote:

Originally Posted by danielldaniell (Post 5006299)
...I don't want to have anything to do with NVIDIA cards, or with its proprietary drivers; if I could, I'd take that video card off of the laptop...

Sounds like a personal problem. :scratch: What did they do to you?

The 3.10.5 kernel, xorg 1.14.2, and the Nvidia 325.15 driver are working perfectly on this box (other than the audio volume problem that started after the 3.8.13 kernel).

dunric 08-09-2013 12:43 PM

Heh, I'm crashing too. The `mei' module issue is not yet fixed even in the latest 3.10.5 kernel.
There are mei init/reset/timeout patches that do work for some people. They look innocent, so I'm trying them myself.

Kernel development has taken some twists; I also do not remember so many issues or such frequent crashes in a long time. Not only the kernel but also other popular software like Firefox or GNOME has been getting crippled in the recent past.

Tough decisions for Pat, to avoid carrying instabilities into the otherwise rock-solid Slackware.

Martinus2u 08-09-2013 01:33 PM

fwiw, one of my machines occasionally panics after resume (sometimes immediately, sometimes a minute later), and has been doing so since 3.9. However, since I use kernel patches and proprietary modules, I don't have anyone to complain to...

edit: intel core i7 (bloomfield) system with nvidia blob

hitest 08-09-2013 02:07 PM

The 3.10.5 kernel is running fine for me on three slackware-current boxes (32 bit). The boxes have Intel video cards. All is well here.

Pentti Poytakangas 08-09-2013 03:00 PM

Hi!
I use S64-14_current with vmlinuz-3.10.5 on an Asus i5 PC... Works fine.
The Intel stuff works too ;)
To begin with, I used the default .config file to compile.
So no problemas!

ReaperX7 08-09-2013 04:14 PM

Both my Slackware and LFS partitions use 3.10.5 without any issues at the moment. I'm even using the Slackware 3.10.5 config for LFS too.


All times are GMT -5.