LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 01-02-2011, 06:13 PM   #31
EdGr
Member
 
Registered: Dec 2010
Location: California, USA
Distribution: Slackware
Posts: 205

Rep: Reputation: 83

Quote:
Originally Posted by hitest
Last crash, snapped a picture.
That looks familiar.

I have seen several different ways in which the kernel can crash. Since patching udev-165, I haven't seen any more crashes.
Ed
 
Old 01-03-2011, 09:41 AM   #32
smoooth103
Member
 
Registered: Aug 2009
Location: NC, USA
Distribution: Slackware (64 bit)
Posts: 242

Rep: Reputation: 62
Is it safe to just downgrade to udev-164? Should we expect the next Slackware update to roll back to udev-164 until the bug is fixed, or not?
 
Old 01-03-2011, 10:02 AM   #33
giberg
Member
 
Registered: Aug 2008
Location: Italy
Distribution: Slackware 13.0
Posts: 34

Rep: Reputation: 16
Same problem during boot a few days ago...
 
Old 01-03-2011, 09:03 PM   #34
afreitascs
Member
 
Registered: Aug 2004
Distribution: Debian
Posts: 443

Rep: Reputation: 30
I use Slack-64-m, but I also have a Slack32-current partition, which I
updated today. After a few boots I got a kernel panic.

From the slack64-M-current partition (the one I'm running now),
I took a look at /var/log/syslog on the slack32-current partition.
Here it is:

+========================================+

Code:
Jan  3 23:23:13 base2 kernel: [    0.157045] raid6: int32x1   1281 MB/s
Jan  3 23:23:13 base2 kernel: [    0.174046] raid6: int32x2   1125 MB/s
Jan  3 23:23:13 base2 kernel: [    0.191042] raid6: int32x4    757 MB/s
Jan  3 23:23:13 base2 kernel: [    0.208002] raid6: int32x8    746 MB/s
Jan  3 23:23:13 base2 kernel: [    0.225018] raid6: mmxx1     2164 MB/s
Jan  3 23:23:13 base2 kernel: [    0.242001] raid6: mmxx2     3855 MB/s
Jan  3 23:23:13 base2 udevd[1599]: bind failed: Address already in use 
Jan  3 23:23:13 base2 udevd[1599]: error binding control socket, seems udevd is already running 
Jan  3 23:23:13 base2 kernel: [    0.259019] raid6: sse1x1    2246 MB/s
Jan  3 23:23:13 base2 kernel: [    0.276009] raid6: sse1x2    3617 MB/s
Jan  3 23:23:13 base2 kernel: [    0.293007] raid6: sse2x1    3832 MB/s
Jan  3 23:23:13 base2 kernel: [    0.310011] raid6: sse2x2    5042 MB/s
Jan  3 23:23:13 base2 kernel: [    0.310015] raid6: using algorithm sse2x2 (5042 MB/s)
Jan  3 23:23:13 base2 kernel: [    0.313378] pnp 00:02: disabling [mem 0x00000000-0x00000fff window] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.313378] pnp 00:02: disabling [mem 0x00000000-0x00000fff window disabled] because it overlaps 0000:01:00.0 BAR 6 [mem 0x00000000-0x0007ffff pref]
Jan  3 23:23:13 base2 kernel: [    0.313378] pnp 00:02: disabling [mem 0x00000000-0x00000fff window disabled] because it overlaps 0000:02:00.0 BAR 6 [mem 0x00000000-0x0000ffff pref]
Jan  3 23:23:13 base2 kernel: [    0.316114] pnp 00:0c: disabling [mem 0x000d2a00-0x000d3fff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.316123] pnp 00:0c: disabling [mem 0x000f0000-0x000f7fff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.316130] pnp 00:0c: disabling [mem 0x000f8000-0x000fbfff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.316137] pnp 00:0c: disabling [mem 0x000fc000-0x000fffff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.316144] pnp 00:0c: disabling [mem 0x00000000-0x0009ffff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.316151] pnp 00:0c: disabling [mem 0x00100000-0x7fddffff] because it overlaps 0000:00:00.0 BAR 3 [mem 0x00000000-0x1fffffff 64bit]
Jan  3 23:23:13 base2 kernel: [    0.951021] pci 0000:00:13.0: OHCI: BIOS handoff failed (BIOS bug?) 00000184
Jan  3 23:23:13 base2 kernel: [    0.988634] highmem bounce pool size: 64 pages
Jan  3 23:23:13 base2 kernel: [    0.991832] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Jan  3 23:23:13 base2 kernel: [    0.992165] DLM (built Oct 11 2010 14:46:35) installed
Jan  3 23:23:13 base2 kernel: [    0.997371] OCFS2 User DLM kernel interface loaded
Jan  3 23:23:13 base2 kernel: [    0.998386] GFS2 (built Oct 11 2010 14:47:37) installed
Jan  3 23:23:13 base2 kernel: [    1.007672] Console: switching to colour frame buffer device 128x48
Jan  3 23:23:13 base2 kernel: [    1.595357] Compaq SMART2 Driver (v 2.6.0)
Jan  3 23:23:13 base2 kernel: [    1.599995] scsi: <fdomain> Detection failed (no card)
Jan  3 23:23:13 base2 kernel: [    1.608948] Emulex LightPulse Fibre Channel SCSI driver 8.3.12
Jan  3 23:23:13 base2 kernel: [    1.610431] Copyright(c) 2004-2009 Emulex.  All rights reserved.
Jan  3 23:23:13 base2 kernel: [    1.635040] Failed initialization of WD-7000 SCSI card!
Jan  3 23:23:13 base2 kernel: [    1.672485] GDT-HA: Storage RAID Controller Driver. Version: 3.05
Jan  3 23:23:13 base2 kernel: [    1.674207] 3ware Storage Controller device driver for Linux v1.26.02.003.
Jan  3 23:23:13 base2 kernel: [    1.675841] 3ware 9000 Storage Controller device driver for Linux v2.26.02.014.
Jan  3 23:23:13 base2 kernel: [    2.172523] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Jan  3 23:23:13 base2 kernel: [    2.179096] ata1: softreset failed (device not ready)
Jan  3 23:23:13 base2 kernel: [    2.179099] ata1: applying SB600 PMP SRST workaround and retrying
Jan  3 23:23:13 base2 kernel: [    2.445517]  sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >
Jan  3 23:23:13 base2 kernel: [    2.605160] registered taskstats version 1
Jan  3 23:23:13 base2 kernel: [    2.642812] EXT3-fs (sda6): error: couldn't mount because of unsupported optional features (240)
Jan  3 23:23:13 base2 kernel: [    2.644658] EXT2-fs (sda6): error: couldn't mount because of unsupported optional features (240)
Jan  3 23:23:13 base2 kernel: [    2.674769] VFS: Mounted root (ext4 filesystem) readonly on device 8:6.
Jan  3 23:23:13 base2 kernel: [    4.056908] ACPI: resource piix4_smbus [io  0x0b00-0x0b07] conflicts with ACPI region SOR1 [??? 0x00000b00-0x00000b0f flags 0x31]
Jan  3 23:23:13 base2 kernel: [    4.062051] k8temp 0000:00:18.3: Temperature readouts might be wrong - check erratum #141
Jan  3 23:23:13 base2 kernel: [    5.247158] nvidia: module license 'NVIDIA' taints kernel.
Jan  3 23:23:13 base2 kernel: [    5.249073] Disabling lock debugging due to kernel taint
Jan  3 23:23:13 base2 kernel: [    6.276402] NVRM: loading NVIDIA UNIX x86 Kernel Module  260.19.21  Thu Nov  4 20:24:24 PDT 2010
Jan  3 23:23:20 base2 console-kit-daemon[1792]: WARNING: Failed to acquire org.freedesktop.ConsoleKit 
Jan  3 23:23:20 base2 console-kit-daemon[1792]: WARNING: Could not acquire name; bailing out 
Jan  3 23:24:19 base2 python: hp-systray[2271]: warning: No hp: or hpfax: devices found in any installed CUPS queue. Exiting.
+========================================+
 
Old 01-03-2011, 10:08 PM   #35
aaazen
Member
 
Registered: Dec 2009
Posts: 357

Rep: Reputation: Disabled
Update Wed Jan 5 2011:

Okay so the bug is back!

I am running slackware current and got crashes twice this morning.

This has to be fixed!

If I were a new user I would give up on Linux and go back to Windows...

In the old days we used to make fun of Windows because the "Blue Screen of Death" would occur so often.

This is exactly what I am seeing now with Linux!

-----------------------------------------------------
Jan 3 2011:

I'm beginning to think that my random kernel oops are caused by my hardware.

When my fairly new PS/2 mouse is connected to a KVM then I get the random kernel oops.

And at other times the mouse is misconfigured by the kernel as a keyboard and does not work as a mouse.

When I disconnect the KVM and plug the mouse directly into the motherboard, then it works fine.

I have gone back to Slackware current using udev-165 and kernel 2.6.35.7

So far no kernel oops and the mouse is recognized as a mouse.

There is still a kernel bug in there and it needs to be fixed, but hopefully it is not a very common bug.

A bad mouse should not be allowed to cause a kernel oops.

Last edited by aaazen; 01-05-2011 at 09:55 AM. Reason: We need to fix this!
 
Old 01-04-2011, 03:12 AM   #36
resonance
LQ Newbie
 
Registered: Dec 2010
Posts: 4

Rep: Reputation: 6
Hi all,
A follow-up to post #25. The problem is probably in the kernel scsi/sg code, specifically in its support for "ATA pass-through" functionality. What's new in udev-165 is a function "disk_identify_packet_device_command", which sends a SCSI ATA pass-through (16-byte) command through the version 4 SG_IO interface to identify a cd/dvd drive and, if that fails with EINVAL, retries through the version 3 interface. I believe it is the version 3 attempt which causes the oops; commenting out line 270 of extras/ata_id/ata_id.c:

253         ret = ioctl(fd, SG_IO, &io_v4);
254         if (ret != 0) {
255                 /* could be that the driver doesn't do version 4, try version 3 */
256                 if (errno == EINVAL) {
257                         struct sg_io_hdr io_hdr;
258
259                         memset(&io_hdr, 0, sizeof(struct sg_io_hdr));
260                         io_hdr.interface_id = 'S';
261                         io_hdr.cmdp = (unsigned char*) cdb;
262                         io_hdr.cmd_len = sizeof (cdb);
263                         io_hdr.dxferp = buf;
264                         io_hdr.dxfer_len = buf_len;
265                         io_hdr.sbp = sense;
266                         io_hdr.mx_sb_len = sizeof (sense);
267                         io_hdr.dxfer_direction = SG_DXFER_FROM_DEV;
268                         io_hdr.timeout = COMMAND_TIMEOUT_MSEC;
269
270                         // ret = ioctl(fd, SG_IO, &io_hdr);
271                         if (ret != 0)
272                                 goto out;
273                 } else {
274                         goto out;
275                 }
276         }

appears to eliminate the panic.

Also, running the "sg_sat_identify" command from the sg3_utils package (http://sg.danny.cz/sg/sg3_utils.html#mozTocId479511), e.g.,
sg_sat_identify -p /dev/dvd

works, while running it as
sg_sat_identify -p -c /dev/dvd

frequently produces a kernel panic which looks the same as the udevd one at bootup. The difference is the -c switch which instructs the kernel to write back ATA register data in the sense buffer. The udev-165 code also does this (setting the ck_cond bit and hence, the oops).

UPDATE: Further testing shows that the ck_cond value is not relevant-- the panic results regardless of how ck_cond is set.
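
For readers unfamiliar with the ck_cond bit mentioned above, here is a rough sketch of where it sits in an ATA PASS-THROUGH (16) command block. This is an illustration only: build_identify_packet_cdb is a made-up helper name, the field values are my reading of the SAT layout rather than a copy of the udev or sg3_utils source, and nothing here talks to a device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ATA_PASS_THROUGH_16      0x85  /* SCSI opcode for ATA PASS-THROUGH (16) */
#define ATA_IDENTIFY_PACKET_DEV  0xA1  /* ATA command IDENTIFY PACKET DEVICE */
#define ATA_PROTO_PIO_DATA_IN    4     /* PROTOCOL field value for PIO Data-In */

/* Fill a 16-byte CDB asking a packet (cd/dvd) device to identify itself.
 * A non-zero ck_cond sets the CK_COND bit, which requests the ATA register
 * contents back in the sense buffer.
 * Hypothetical helper: not a function from udev or sg3_utils. */
void build_identify_packet_cdb(uint8_t cdb[16], int ck_cond)
{
    assert(cdb != NULL);
    memset(cdb, 0, 16);
    cdb[0] = ATA_PASS_THROUGH_16;
    cdb[1] = ATA_PROTO_PIO_DATA_IN << 1;  /* PROTOCOL lives in bits 4:1 */
    cdb[2] = 0x0e;                        /* T_DIR=1 (from device), BYT_BLOK=1, T_LENGTH=2 */
    if (ck_cond)
        cdb[2] |= 1 << 5;                 /* CK_COND is bit 5 of byte 2 */
    cdb[6] = 1;                           /* sector count (7:0): one block */
    cdb[14] = ATA_IDENTIFY_PACKET_DEV;    /* COMMAND byte */
}
```

Calling it with ck_cond set to 1 and then 0 changes only byte 2 of the command block, which is roughly the difference the -c switch makes on the sg_sat_identify command line.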

Last edited by resonance; 01-06-2011 at 02:40 PM. Reason: correction/update
 
2 members found this post helpful.
Old 01-05-2011, 02:38 PM   #37
Old_Fogie
Senior Member
 
Registered: Mar 2006
Distribution: SLACKWARE 4TW! =D
Posts: 1,519

Original Poster
Rep: Reputation: 62
As a test, on one of my test boxes, I put udev-164-i486-3 back in the system, but kept the /etc/rc.d/rc.udev from udev-165 (as that apparently creates the /dev/root properly according to the changelog).

I have yet to notice any issues with the downgrade, and haven't had the boot time kernel oops yet after a bunch of halt, reboot, suspend or hibernate.
 
Old 01-05-2011, 05:26 PM   #38
resonance
LQ Newbie
 
Registered: Dec 2010
Posts: 4

Rep: Reputation: 6
I just installed linux-2.6.37, and re-installed a vanilla udev-165. So far things look good, so (hopefully) the issue has been resolved in the kernel. The problem may be related to the fix (http://www.kernel.org/pub/linux/kern...geLog-2.6.37):

commit 2a5f07b5ec098edc69e05fdd2f35d3fbb1235723
Author: Tejun Heo <tj@kernel.org>
Date: Mon Nov 1 11:39:19 2010 +0100

libata: fix NULL sdev dereference race in atapi_qc_complete()

SCSI commands may be issued between __scsi_add_device() and dev->sdev
assignment, so it's unsafe for ata_qc_complete() to dereference
dev->sdev->locked without checking whether it's NULL or not. Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
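
The kind of check the commit message describes can be sketched in plain C as below. This is not the kernel patch: demo_sdev, demo_dev and demo_clear_locked are hypothetical stand-ins, and the sketch only shows the "test the pointer before dereferencing it" step, not the locking around it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the commit. */
struct demo_sdev { int locked; };
struct demo_dev  { struct demo_sdev *sdev; };

/* Clear the "locked" flag only if sdev has already been assigned.
 * Returns 1 if the flag was cleared, 0 if sdev was still NULL. */
int demo_clear_locked(struct demo_dev *dev)
{
    assert(dev != NULL);
    if (dev->sdev == NULL)
        return 0;           /* device not fully attached yet: skip */
    dev->sdev->locked = 0;
    return 1;
}
```

Without the NULL test, a command completing before sdev is assigned would dereference a NULL pointer, which is the sort of thing that shows up as an oops.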
 
Old 01-05-2011, 05:33 PM   #39
andrew.46
Senior Member
 
Registered: Oct 2007
Distribution: Slackware
Posts: 1,024

Rep: Reputation: 256
I am still quietly putting up with the issue which sees a failed boot one in every 10 or so boots. Any word on an 'official' fix from the slackware team? Otherwise I suspect I shall simply downgrade the udev package.....
 
Old 01-05-2011, 09:59 PM   #40
rworkman
Slackware Contributor
 
Registered: Oct 2004
Location: Tuscaloosa, Alabama (USA)
Distribution: Slackware
Posts: 2,377

Rep: Reputation: 936
That patch is in 2.6.35.10; has anyone reproduced the problem with that kernel, perchance?

New kernels in -current are still a little ways out, I think - some other stuff probably needs to hit the tree first.
 
Old 01-06-2011, 07:36 AM   #41
andrew.46
Senior Member
 
Registered: Oct 2007
Distribution: Slackware
Posts: 1,024

Rep: Reputation: 256
Quote:
Originally Posted by rworkman
That patch is in 2.6.35.10; has anyone reproduced the problem with that kernel, perchance?
I am new to -current and keen to get involved so:

Code:
andrew@skamandros~$ uname -r
2.6.35.10-ads
Used PV's config and just built the filesystem in; I shall watch for the familiar problem over the next couple of days.
 
Old 01-06-2011, 02:34 PM   #42
resonance
LQ Newbie
 
Registered: Dec 2010
Posts: 4

Rep: Reputation: 6
Follow-up to post #38:
I jumped the gun... The problem remains in kernel linux-2.6.37.

Follow-up to post #36:
The problem is probably in the kernel block, drivers/scsi/sg, or drivers/scsi/sd code, specifically related to "ATA pass-through" functionality, and probably only occurs for certain drive hardware. I don't know anything about this code, so until someone who does can fix it, using udev-164 (which doesn't use the ATA pass-through command on cd/dvd devices in ata_id.c), or commenting out this command in ata_id.c for udev-165, will side-step the issue for me.

Additional experimentation:
Another possible cause of this oops could be inappropriate buffer alignment. I built udev-165 with ata_id.c patched to use page-aligned sense and response buffers (rather than simple unsigned char arrays), and so far it looks promising: no panics yet (see attached patch).
Attached Files
File Type: txt 02-udev-165-ata-pass-through-memalign.patch.txt (1.3 KB, 23 views)
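
For anyone who wants to try the page-aligned buffer idea without the attachment, a minimal sketch follows. It is not the attached patch: alloc_page_aligned and is_page_aligned are hypothetical helper names, and the approach (posix_memalign with the page size from sysconf) is just one common way to get page-aligned memory on Linux.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Allocate a zeroed buffer of len bytes aligned to the system page size.
 * Returns NULL on failure; release the buffer with free().
 * Hypothetical helper: not taken from the attached patch. */
void *alloc_page_aligned(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    assert(len > 0);
    if (page <= 0)
        return NULL;
    if (posix_memalign(&p, (size_t)page, len) != 0)
        return NULL;
    memset(p, 0, len);
    return p;
}

/* Non-zero if p sits on a page boundary. */
int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

In ata_id.c terms, the sense and response arrays would be replaced by pointers obtained this way and freed on the way out; whether alignment is really what avoids the panic is still an open question.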

Last edited by resonance; 01-06-2011 at 02:41 PM. Reason: Correction
 
Old 01-06-2011, 03:45 PM   #43
hitest
Guru
 
Registered: Mar 2004
Location: Prince Rupert, B.C., Canada
Distribution: Slackware
Posts: 5,867

Rep: Reputation: 2036
Quote:
Originally Posted by resonance
Follow-up to post #38:
I jumped the gun... The problem remains in kernel linux-2.6.37.

Follow-up to post #36:
The problem is probably in the kernel block, drivers/scsi/sg, or drivers/scsi/sd code, specifically related to "ATA pass-through" functionality, and probably only occurs for certain drive hardware.
I agree with this observation. I have an old PIII 850 MHz 32-bit -current box that has no issues with booting at all. But my two newer Intel dual-core boxes were crashing with 32-bit -current. I've moved my newer boxes to Slackware64-current for the time being.
 
Old 01-06-2011, 09:58 PM   #44
aaazen
Member
 
Registered: Dec 2009
Posts: 357

Rep: Reputation: Disabled
Update Fri Jan 7:

2.6.35.10 and udev-165 crashed the same way this morning...
----------------------------------------------------
Thurs Jan 6:

Quote:
Originally Posted by andrew.46

Used PV's config and just built the filesystem in; I shall watch for the familiar problem over the next couple of days.
I did the same thing using the current huge smp kernel config to build a 2.6.35.10 kernel.

And I now use 2.6.35.10 and udev-165 along with the rest of Slackware current.

My setup is a new Intel dual-core D510MO motherboard using a SATA drive hooked up to the motherboard disk controller.

Last edited by aaazen; 01-07-2011 at 10:02 AM. Reason: 2.6.35.10 and udev-165 crashed this morning
 
Old 01-06-2011, 10:53 PM   #45
chytraeus
Member
 
Registered: Dec 2008
Distribution: slackware64 openbsd
Posts: 105

Rep: Reputation: 11
Quote:
Originally Posted by rworkman
That patch is in 2.6.35.10; has anyone reproduced the problem with that kernel, perchance?

New kernels in -current are still a little ways out, I think - some other stuff probably needs to hit the tree first.
I have been running 2.6.35.10 long-term for about two weeks now. I had two kernel panics in a row last night. I have had kernel panics a couple of other times as well. I don't know how to capture the output from the panic. It looks similar to what I've read in this thread. I must say I'm relieved it's not just my system.
 
  

