Linux - General
This Linux forum is for general Linux questions and discussion. If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
Running Slackware 7 (kernel 2.2.19) with RAID 5 on a Megaraid 300 card, three Seagate 9GB SCSI drives and one hot swap.
Problems started yesterday when the server began spontaneously rebooting itself every 20-40 minutes. I had to run fsck a couple of times to get the /dev/sd3a partition right after the sudden reboots, but managed to recover the server each time. I swapped the UPS, power cable etc. between outages to rule those out. Eventually, around 4pm, the server stopped completing the boot at all.
Where I am at:
Linux starts booting, goes past LILO, picks up the IDE CDROM drive okay, and then:
scsi0:scanning channel 1 for devices
At this point the server reboots itself.
I have tried:
- changed the RAM
- changed the power cords and UPS
- removed all unnecessary cards
- selected a 2.2.16 kernel that was still on the server
- made a boot disk with 'RAWRITE megaraid.s a:'
- ran the consistency check in the Megaraid BIOS: 100% okay
- all 3 drives are ONLINE in the Megaraid BIOS, condition optimal
What concerns me is that the boot disk which I made from a Slackware 7 CD with 'RAWRITE megaraid.s a:' causes the server to reboot at exactly the same point.
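Side note for anyone reading this later with only a Linux box to hand: the boot disk above was made with RAWRITE under DOS, but the same raw copy can be done from Linux. Below is a minimal Python sketch of that idea, not what was actually run here. The function names `write_image` and `verify_image` are made up for illustration, and `/dev/fd0` as the floppy device is an assumption about your system. The read-back check is there because a bad floppy is one more thing you don't want to be guessing about.

```python
import hashlib
import os

# Copy in chunks of 18 sectors x 512 bytes. The exact size is an
# arbitrary choice for this sketch; any chunk size gives the same result.
CHUNK = 18 * 512


def write_image(image_path, target_path):
    """Copy a raw disk image byte-for-byte onto target_path.

    target_path would typically be a floppy device such as /dev/fd0
    (an assumption), but any writable path works.
    Returns the number of bytes written.
    """
    written = 0
    with open(image_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
        # Push the data out of Python's and the OS's buffers.
        dst.flush()
        os.fsync(dst.fileno())
    return written


def verify_image(image_path, target_path):
    """Read back as many bytes as the image holds and compare SHA-256 digests."""
    size = os.path.getsize(image_path)
    with open(image_path, "rb") as f:
        expected = hashlib.sha256(f.read()).hexdigest()
    with open(target_path, "rb") as f:
        actual = hashlib.sha256(f.read(size)).hexdigest()
    return expected == actual
```

Usage would be something like `write_image("megaraid.s", "/dev/fd0")` followed by `verify_image("megaraid.s", "/dev/fd0")`, run as root with the Slackware CD's boot disk image in the current directory.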
It sounds like, with the multiple kernels you've tried, it isn't a software issue.
Offhand, if you want to make uber-certain of the RAM: www.memtest86.org (coolest thing since microwave burritos), but I'm with you, I doubt it's that too...
SCSI voodoo sucks... but it sounds like the line isn't terminated and is causing the RAID card's BIOS to hiccup and reboot... Honestly, I had a kernel panic to chase around the other day and finally decided to start from the ground up: I changed my BIOS and SCSI card BIOS back to defaults and then added devices back in from there. Also, the world's biggest longshot, but SCSI cabling has been known to just flake out over time...
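The "start from the ground up and add things back" routine can be sketched roughly like this in Python. This is only an illustration of the procedure: `first_bad_component` is a made-up name, and `boots_ok` is a hypothetical stand-in for a person actually power-cycling the box and watching whether it survives.

```python
def first_bad_component(baseline, candidates, boots_ok):
    """Add parts back one at a time and report the first one that breaks the boot.

    baseline   -- parts in the stripped-down machine (defaults restored)
    candidates -- parts to add back, in the order you want to try them
    boots_ok   -- hypothetical callable: takes the list of installed parts,
                  returns True if the machine boots and stays up

    Returns the offending part, or None if everything went back in cleanly.
    Raises ValueError if even the bare baseline fails, since the fault is
    then in one of the baseline parts and add-back testing cannot isolate it.
    """
    installed = list(baseline)
    if not boots_ok(installed):
        raise ValueError("baseline configuration fails; suspect a baseline part")
    for part in candidates:
        installed.append(part)
        if not boots_ok(installed):
            return part
    return None


# Example with a simulated fault: the machine falls over whenever
# the (hypothetical) flaky cable is connected.
simulated = lambda parts: "scsi cable" not in parts
culprit = first_bad_component(
    ["mainboard", "ram", "video"],
    ["raid card", "scsi cable", "drives"],
    simulated,
)
print(culprit)  # prints "scsi cable"
```

The baseline check matters: if the bare machine already misbehaves, no amount of adding cards back will point at the real culprit.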
For 8.0 there's also a raid.i that probably has the right support for your card. What I had assumed from that starting post was that this behavior was rather sudden and that the machine had been running under Slack 8.0 for quite a while? If not, and maybe even just for the heck of it: the 2.2.19 and 2.2.16 kernels are pretty ancient. Slack 8.1 comes with 2.4.18, and a lot of re-working has gone into the RAID sections of the kernel over the past year and a half that 2.2.19 has been out. That cable guess really is a long shot in the dark too, but isn't that just about the last piece of hardware left to sort through?
Also, never underestimate the power of the motherboard's BIOS to have doinked things up.
Thanks again, it's always hard to post *all* the details, so I prolly left some important stuff out. The server has been running 18 months now on Slackware 7. The reason I booted off a Slackware 8 disk is that my 7 disk went AWOL, and the v8 disk was all I had on hand!
Will try the raid.i option in the morning, thanks.
BTW, I only touched Linux for the first time in February, I inherited this server with the job! The server only does DHCP, Squid and SMTP, and runs Apache and PHP4 for my own use. I have managed to set up critical services on other servers in the interim, so we are at least operational and connected!
Okay, put a couple of other drives in, created a new stack, and tried to load Windows XP Pro as a test. Tried three times, and the PC rebooted spontaneously each time, at a different point, with anywhere from 30 minutes to 10 minutes of the install process left. Tomorrow we change the Megaraid card. Phew.
Got a new Megaraid card... same problem. Where next? Took the server offsite to where we had access to more parts, and word just in is that it seems to be the mainboard, of all things! Can't believe it, will believe it when the new board arrives Monday. In the meantime I guess I have learned a lot about Linux, server hardware, and about FreeBSD too, which is what I just set up the interim server with. Sigh.
I hate chasing hardware problems too, when it's such a mess that you have no idea what's up. A long time ago I started carrying around a copy of tomsrtbt, available at www.toms.net, a bootable Linux distro on a floppy. A friend of mine, with a lot more varied hardware to deal with (where Tom's limited nature just wouldn't cut it), went one step further and has a hard drive with a bloated no-X install of Slack 8 and the newest kernel on it to diagnose hardware issues. Just a thought; although it seems like a length to go to, it works really well.