Linux - General
This Linux forum is for general Linux questions and discussion. If it is Linux-related and doesn't seem to fit in any other forum, then this is the place.
Running Slackware (kernel 2.2.19) with RAID 5 on a Megaraid 300 card, three Seagate 9GB SCSI drives and one hot swap.
Problems started yesterday, when the server began spontaneously rebooting every 20-40 minutes. I had to run fsck a couple of times to repair the /dev/sd3a partition after the sudden reboots, but managed to recover the server each time. I changed the UPS, power cable, etc. between outages to rule those out. Eventually, around 4pm, the server stopped completing the boot at all.
Where I am at:
Linux starts booting, goes past LILO, picks up the IDE CDROM drive okay, and then:
.
.
.
scsi:1host
scsi0:scanning channel 1 for devices
At this point the server reboots itself.
I have tried:
- changing the RAM
- changing the power cords and UPS
- removing all unnecessary cards
- selecting a 2.2.16 kernel that was still on the server
- making a boot disk with 'RAWRITE megaraid.s a:'
- running a consistency check in the Megaraid BIOS: 100% okay
- checking the drives: all 3 are ONLINE in the Megaraid BIOS, condition optimal
What concerns me is that the boot disk which I made from a Slackware 7 CD with 'RAWRITE megaraid.s a:' causes the server to reboot at exactly the same point.
It sounds like, with the multiple kernels you've tried, it isn't a software issue.
Offhand, if you want to make uber-certain of the RAM: www.memtest86.org (coolest thing since microwave burritos), but I'm with you, I doubt it's that either...
SCSI voodoo sucks... but it sounds like the line isn't terminated and is causing the RAID card's BIOS to hiccup and reboot... Honestly, I had a kernel panic to chase around the other day and finally decided to start from the ground up: I changed my BIOS and SCSI card BIOS back to defaults and then went adding devices back in from there. Also, the world's biggest long shot, but SCSI cabling has been known to just flake out over time...
Thanks for the reply, I will try the cable tomorrow. Just seems odd that I can do checks on the drives from the megaraid setup, you would think that that means all is well with the drives.
With the boot disk, I read later that megaraid.s supports the 4XX Megaraid cards, but does not mention the Megaraid 300. I wonder if the incorrect kernel would cause a reboot.
I also tried booting off a bootable Slackware 8 CD, but it did not detect the Megaraid card. I tried bare.i and scsi.s, but it did not detect the card in either case. But it also did not reboot :-)
Tomorrow we load up some other drives, set up a stack and try to load an OS on those drives to see if we can fault the Megaraid, hot swap or cabling.
For 8.0 there's a raid.i also that probably has the right support for your card. What I had assumed from that starting post was that this behavior was rather sudden and that the machine had been running under Slack 8.0 for quite a while? If not, and maybe even just for the heck of it, consider a newer kernel: the 2.2.19 and 2.2.16 kernels are pretty ancient. Slack 8.1 comes with 2.4.18, and a lot of reworking has gone into the RAID sections of the kernel over the past year and a half that 2.2.19 has been out. That cable guess really is a long shot in the dark too, but isn't that just about the last piece of hardware left to sort through?
Also, never underestimate the power of the motherboard's BIOS to have doinked things up.
Thanks again, it's always hard to post *all* the details, so I prolly left some important stuff out. The server has been running 18 months now on Slackware 7. The reason I booted off a Slackware 8 disk is because my 7 disk went AWOL, and the v8 disk was all I had on hand!
Will try the raid.i option in the morning, thanks.
BTW, I only touched Linux for the first time in February, I inherited this server with the job! The server only does DHCP, Squid and SMTP, and runs Apache and PHP4 for my own use. I have managed to set up critical services on other servers in the interim, so we are at least operational and connected!
Sorry, it has been running 8, not 7 as I said earlier. Tried a 'rawrite raid.s a:' boot disk again this morning: same behaviour, it reboots when it scans the SCSI channel.
Okay, put a couple of other drives in, created a new stack, and tried to load Windows XP Pro as a test. Tried three times, and the PC rebooted spontaneously each time, at a different point, with anywhere from 30 minutes to 10 minutes of the install process left. Tomorrow we change the Megaraid card. Phew.
Got a new Megaraid card... same problem. Where next? Took the server offsite to where we had access to more parts, and word just in is that it seems to be the mainboard, of all things! Can't believe it; will believe it when the new board arrives Monday. In the meantime I have learned a lot, I guess, about Linux, server hardware, and about FreeBSD too, which is what I just set up the interim server with. Sigh.
I hate chasing hardware problems too, when it's such a mess that you have no idea what's up. A long time ago I started carrying around with me a copy of tomsrtbt, available at www.toms.net, a bootable Linux distro on a floppy. A friend of mine with a lot more varied hardware to deal with (where Tom's limited nature just wouldn't cut it) went one step further and has a hard drive with a bloated no-X install of Slack 8 and the newest kernel on it to diagnose hardware issues. Just a thought; although it seems like a length to go to, it works really well.