LinuxQuestions.org
Old 04-23-2009, 11:00 AM   #1
JimMorbid
LQ Newbie
 
Registered: Apr 2009
Posts: 6

Rep: Reputation: 0
Numerous Slackware machines kernel panic / unable to mount root


Hi

About 10 of my Slackware servers have suddenly started kernel panicking, across numerous release versions (10.1-12.2). I haven't been out to see what's wrong, as they are located all over the country, but was wondering if anyone has had a similar problem?

Thanks,

JM
 
Old 04-23-2009, 11:25 AM   #2
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1269
When does this happen? My brother also reports random kernel panics on startup. They don't happen often, and the reason is so far unknown. This is with Slackware 12.1.
 
Old 04-23-2009, 11:48 AM   #3
JimMorbid
LQ Newbie
 
Registered: Apr 2009
Posts: 6

Original Poster
Rep: Reputation: 0
I have about 250 slack servers and like I said 10-15 of them all started doing this today. They were rebooted and then seem to hang on a panic of some sort. Unfortunately that is all I know (haven't been able to procure one for myself to experience first hand).

There is absolutely no obvious joining factor between them... some are old, some have a lot of packages installed, some use Reiser, some EXT3. One of them I set up on brand-new hardware yesterday and installed at my client's site today, with absolutely nothing on it save OpenVPN and patches.

So very confused about this...
 
Old 04-23-2009, 12:41 PM   #4
metrofox
Member
 
Registered: Jan 2009
Location: Palermo, Italy
Distribution: Slackware
Posts: 236

Rep: Reputation: 37
Where do you get this kernel panic? I mean, which server gives you this error? What kernel is used on these 10 servers?
 
Old 04-23-2009, 02:24 PM   #5
+Alan Hicks+
Member
 
Registered: Feb 2005
Distribution: Slackware
Posts: 72

Rep: Reputation: 54
We're not gonna be able to troubleshoot this until you can give us some more information. The only thing I can think that's changed recently and might cause something like this would be udev, but that's assuming every machine that's experiencing this problem is using a 2.6 kernel. Since you said many of these machines are 10.x servers, they are not likely to be running a 2.6 kernel.
 
Old 04-23-2009, 02:30 PM   #6
JimMorbid
LQ Newbie
 
Registered: Apr 2009
Posts: 6

Original Poster
Rep: Reputation: 0
Thanks but I'm not really looking for a solution as of yet - was really just seeing if anyone else has had similar problems today/yesterday.

JM
 
Old 04-23-2009, 03:32 PM   #7
Alien Bob
Slackware Contributor
 
Registered: Sep 2005
Location: Eindhoven, The Netherlands
Distribution: Slackware
Posts: 5,259

Rep: Reputation: Disabled
You have 15 servers hanging at boot but are not looking for a solution. Yeah right.

Eric
 
Old 04-23-2009, 05:30 PM   #8
DavidHindman
Member
 
Registered: Dec 2008
Distribution: Slack 13 + JWM
Posts: 101

Rep: Reputation: 23
Quote:
Originally Posted by JimMorbid
I have about 250 slack servers...
I doubt it's a specific issue with Slackware.

This is a wild guess, but 250 is suspiciously close to the magic number 256.

If you have 250 servers connecting to "something" over openvpn tunnels, the "something" may be running out of tcp sockets. That is, if the default of that something is 256 sockets, you hit the limit when you're trying to support 250 servers.

See this link for someone who has described a similar problem:

Link

We probably can't give a lot of help without more information about your setup.
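
As a rough illustration of the kind of limit being guessed at here, the Python sketch below reads the per-process open-file limit, which also caps the number of sockets a single process can hold, and compares it with an expected connection count. Whether the "something" in question is actually bounded by this limit is an assumption; `current_soft_limit`, `headroom`, and the numbers 256 and 250 are illustrative only.

```python
import resource


def current_soft_limit() -> int:
    """Return the soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def headroom(soft_limit: int, expected_connections: int, reserved: int = 0) -> int:
    """Descriptors left after the expected connections and a reserve.

    `reserved` stands for descriptors used by logs, config files, and so on
    (a made-up allowance, not a measured value). A negative result means the
    limit would be exceeded.
    """
    return soft_limit - reserved - expected_connections


if __name__ == "__main__":
    # Hypothetical figures taken from the guess above: a 256 limit, 250 clients.
    print("headroom with a 256 limit:", headroom(256, 250, reserved=10))
    print("soft limit on this machine:", current_soft_limit())
```

A negative headroom would only explain refused connections on the server side; it would not by itself explain clients panicking at boot.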

Last edited by DavidHindman; 04-23-2009 at 05:42 PM. Reason: Fix link - yet again
 
Old 04-24-2009, 12:19 AM   #9
astrogeek
Senior Member
 
Registered: Oct 2008
Distribution: Slackware: 12.1, 13.1, 14.1, 64-14.1, -current, FreeBSD-10
Posts: 1,872

Rep: Reputation: 642
Actually, there IS an obvious joining factor...

Quote:
Originally Posted by JimMorbid
There is absolutely no obvious joining factor between them...
Actually, there IS only one obvious joining factor between them - the fact that they are connected by YOU.

I do not mean by that, that you are responsible, but the common factor is probably some aspect of YOUR use of them.

Consequently, I expect DavidHindman is probably on the right track.
 
Old 04-24-2009, 02:00 AM   #10
JimMorbid
LQ Newbie
 
Registered: Apr 2009
Posts: 6

Original Poster
Rep: Reputation: 0
Okay, well, it seems that all the machines have superblock I/O errors and hang on a kernel panic (at boot). I've tried reiserfsck --check, --rebuild-sb, and then --rebuild-tree, and it finally fails with a bad root block...
 
Old 04-24-2009, 02:28 AM   #11
DavidHindman
Member
 
Registered: Dec 2008
Distribution: Slack 13 + JWM
Posts: 101

Rep: Reputation: 23
Well, without further details we're still just guessing. I understand that it's not practical for you to give more details if you don't know where to focus at this point. So I hope you don't mind if I make another guess.

Once I had a problem with occasional reiserfs disk volume corruption. It acted like the drive was not being unmounted properly at shutdown.

My home directory was mounted on a USB drive connected through a PCMCIA USB 2.0 interface. The problem turned out to be in /etc/rc.d/rc.6, which shut down the PCMCIA interface before the drive was unmounted. That caused the occasional reiserfs corruption.

Last edited by DavidHindman; 04-24-2009 at 02:31 AM.
 
Old 04-24-2009, 02:44 AM   #12
JimMorbid
LQ Newbie
 
Registered: Apr 2009
Posts: 6

Original Poster
Rep: Reputation: 0
Thanks all, but at this point it looks like the partition table was deleted (in its entirety) and a single partition written over it. I assume there is no recovering from that...
 
Old 04-24-2009, 02:52 AM   #13
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,280

Rep: Reputation: 1028
Erk.
Normally I would say testdisk - for the situation where partitions have been deleted and another defined over the space. That is no big deal, and usually easily recoverable.
But sounds like you have (been) reformatted as well - as evidenced by the fact that you could run fsck against the (now) single partition.

Doesn't sound good.
Have you looked for intrusions?
 
Old 04-24-2009, 03:04 AM   #14
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,280

Rep: Reputation: 1028
Maybe not - the underlying filesystem will still look valid if the old first partition and this new one start at the same point.
Try testdisk anyway, and see if it can find the original partitions.
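
To make the point about matching start positions concrete, here is a minimal Python sketch (not a recovery tool, and no substitute for testdisk) of how the four primary entries of a classic MBR partition table can be read. It assumes an MBR rather than GPT disk and 512-byte sectors; `parse_mbr`, `make_sector`, and the sample layouts are made up for illustration.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # the partition table starts at byte 446 of the MBR
ENTRY_SIZE = 16             # four entries of 16 bytes each
SIGNATURE = b"\x55\xaa"     # boot signature stored in bytes 510-511
# Entry layout: boot flag, 3 bytes CHS start, type, 3 bytes CHS end,
# start LBA (u32, little-endian), sector count (u32, little-endian).
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector: bytes):
    """Return (slot, type, start_lba, sector_count) for each non-empty entry."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    if sector[510:512] != SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for slot in range(4):
        offset = TABLE_OFFSET + slot * ENTRY_SIZE
        _boot, ptype, start_lba, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:      # type 0 marks an unused slot
            entries.append((slot + 1, ptype, start_lba, count))
    return entries


def make_sector(parts):
    """Build a fake MBR sector from (type, start_lba, sector_count) tuples."""
    sector = bytearray(SECTOR_SIZE)
    for slot, (ptype, start_lba, count) in enumerate(parts):
        offset = TABLE_OFFSET + slot * ENTRY_SIZE
        struct.pack_into(ENTRY_FORMAT, sector, offset, 0, ptype, start_lba, count)
    sector[510:512] = SIGNATURE
    return bytes(sector)


# Hypothetical layouts: the original table, and a single partition written over it.
old_table = parse_mbr(make_sector([(0x83, 63, 2000000), (0x82, 2000063, 500000)]))
new_table = parse_mbr(make_sector([(0x83, 63, 4000000)]))

# Same start LBA in slot 1, so the old filesystem still appears where the
# new entry says a partition begins.
same_start = old_table[0][2] == new_table[0][2]
```

On a real machine the 512 bytes would come from the first sector of the disk, read as root from the device node (for example /dev/hda on an older IDE system), and it is safer to work on an image copy than on the disk itself.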
 
Old 04-24-2009, 03:19 AM   #15
DavidHindman
Member
 
Registered: Dec 2008
Distribution: Slack 13 + JWM
Posts: 101

Rep: Reputation: 23
Yeah, it might be a really good idea to change passwords, ssh keys, etc.

Does this occur only when you are using reiserfs?

Did this problem start occurring suddenly at some point?

What is the interface between the servers and the corrupted disks? Internal, external, or is there some kind of network mount?

Does the interface to the root volume depend on something complicated like a vpn tunnel or a network connection? What happens if that dependency fails?

How many drives are connected to each server? I could imagine a case where you have hda, hdb, hdc and so forth. Then detection of one disk fails at startup, and hdc becomes hda, and so on. When you look at what you think is hda, you're actually seeing hdc.
 
  


All times are GMT -5.