Power Failure > Hard Disks Gone
Not a Slackware problem as such, but you guys are so knowledgeable.
Overnight there was a power cut. Server and desktop PC (both Slack 13.37) were both off in the morning. The server booted fine. The desktop PC does the LILO thing and then starts booting, but when it gets to mounting drives it fails after trying several times to identify the boot disk's file system. The BIOS maintenance screens tell me I have an IDE CD-ROM drive on Slave 1 but no disks on Master 1 or Masters/Slaves 2 or 3. Booting with a rescue CD allows me to run fdisk, cfdisk, parted etc. All say they can detect no disks. Not surprising, I guess, if the BIOS thinks it has no disks. What is surprising is that LILO and the first 5-6 seconds' worth of boot processes seem to work OK. I am guessing that some mechanism is used which bypasses the BIOS and reads the first sectors directly from the hard drive? Is this more likely to be a failure of the motherboard disk controller than of the disks (three would have to fail simultaneously)? Thanks for any insights. |
As you say, unusual for 3 disks to fail together, but you don't know the circumstances of the power cut if it happened overnight, and there could be more to it than you imagine.
Having said that, it's always worth resetting the computer's BIOS. Hopefully your motherboard manual will tell you how to do that, but if push comes to shove, just remove the back-up battery from the motherboard and short out the contacts with something, making sure the machine is unplugged from the mains first. Put the battery back in, power up and see how it goes. If that doesn't work, then I would personally remove the drives and attach them to an external caddy to determine their fitness for use. |
Over and above @vdemuth's advice (which is spot on), you can always try another disk in the machine to establish whether the motherboard or PCI controller is broken.
|
First, does it have a surge protector? If it doesn't, the PSU may be damaged. If it does, maybe the CMOS battery is dead, or maybe something else broke coincidentally.
I always have a surge protector and on desktops a UPS as well. Remember to run fsck on partitions that refuse to mount. |
Even with a surge protector - sometimes the power supply can get hosed. Had it happen to me.
|
OK, having taken the lid off I can provide some better information. There are two hard drives, not three, and one of them (on the IDE bus) works OK provided that the other one (SATA) is not connected. It seems to be the SATA drive which has suffered, and this (of course) is my main boot drive and also holds the /home directories. On initial boot from CD (Slackware installation or various live CD rescue setups) a partition table is seen by tools like cfdisk and parted. It shows my main ext4 partition as the second partition on the disk (the 1st is NTFS and the 3rd is Linux swap), but as soon as I try to do anything with it (like mount it) I get read errors, and subsequently all the disk utilities say there is no partition table: it is all unallocated space.
I do not have a spare SATA drive to plug in to see if the problem is indeed in the drive or in the motherboard controller. Since the partition table is initially visible (and is correctly identified as ext4) I feel the data must surely be there. fsck says : Quote:
|
You should be able to clone it with a utility like clonezilla if that is the case, as it won't need to be mounted anyway. There are a great many rescue live Linux distributions, and suggesting any one over another is purely a matter of personal preference. I would try a few out and see how things transpire.
|
Sounds like a broken disk. What does SMART say?
Code:
smartctl -a /dev/sda |
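A rough way to pick the usual warning signs out of the SMART attribute table, as a minimal sketch only. The function name smart_warnings is made up for illustration, it assumes the 10-column attribute table that smartctl -A prints for ATA drives, and the three attribute names checked are common ones that can vary between vendors.

```shell
# smart_warnings: read `smartctl -A` output on stdin and print the name and
# raw value of attributes commonly associated with failing media, but only
# when the raw value (assumed to be column 10) is greater than zero.
smart_warnings() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2, $10 }'
}
```

It would be used as smartctl -A /dev/sda | smart_warnings. No output only means those particular counters are zero or absent, not that the drive is healthy.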
Well, it seems to be failing, so store an image of the drive before it dies, or carve data off it and onto another drive with testdisk or foremost.
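For the imaging step, something along these lines could work as a minimal sketch using plain dd. The function name image_drive and the paths in the usage note are hypothetical, and a dedicated recovery tool may cope better with a badly damaged disk.

```shell
# image_drive SRC DST: copy SRC (a device or file) to the image file DST,
# carrying on past read errors instead of aborting.
image_drive() {
    src=$1
    dst=$2
    # Refuse to overwrite an existing image by accident.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    # conv=noerror continues after a read error; conv=sync pads every short
    # or failed block with zeros so that later data keeps its offset.
    # Side effect: the image size is rounded up to a multiple of bs.
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}
```

Run from a rescue CD with the bad disk unmounted, e.g. image_drive /dev/sda /mnt/backup/sda.img, where /mnt/backup is a placeholder for a mounted, healthy drive with enough free space. The device may not be /dev/sda on your system, so check first.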
|
Get disk drive diagnostics either from the manufacturer or from a CD downloaded from bootdisk.com. Nothing can say more than the manufacturer's own disk diagnostic. Best is to not change anything until those facts are known. Even SMART information from the drive can be obtained. |
The 'controller' concept is a legacy of MFM technology drives. Once IDE was created, what was once a controller was reduced to a mere data interface. So what has failed? The controller on that disk drive? Or the actual data platter and interface electronics (i.e. heads or motor)? Manufacturer's diagnostics (discussed earlier) will say so much more. Diagnostics should be executed before trying to fix anything. |
The most common thing that fails is the motor, but it could be anything. I would like to see a link that better explains what you want to say, because I don't understand.
I have learned what I know from observation, some reading and some thinking. On one of my motherboards I can see a chip marked JMicron (I assume this is the controller), and in Linux, without the jmicron module / driver, I cannot use the HDDs connected to the color-coded SATA ports belonging to this controller. The board also has an Intel controller built into the southbridge, which I can also see on the board, marked Intel ICH9 or something similar, and which shows up in 'lspci' output. It exhibits the same behavior. I can only deduce that each controller needs its own SATA driver.
It also depends on BIOS settings. I set AHCI in the BIOS because it has more features, and I've noticed that the Linux ahci driver is more stable and better maintained than the individual SATA drivers, and it will become the standard.
I have also opened a HDD and seen its workings. It has a controller chip on it too, although that chip cannot be accessed from the OS (Linux) and does NOT need drivers. It is there to coordinate low-level things like moving the heads to the right place. I have read that laptop HDDs are specially made to withstand more power cycles than desktop drives because of power-saving features, and because the drive motor is one of the first things to go, especially on a laptop.
Now I would like an explanation of what "computers" might be on the disk drive, because I haven't seen any. I would also like to know the difference between an interface and a controller and what that means in the real world.
Either way, the best thing to do at the moment is to back up the data on that drive by imaging it or carving data from it onto a working drive. If the drive is dying, it will become inaccessible soon, and the only way to recover any data then is to take it to a special facility with a clean room, where they will remove the platters, mount them on a separate motor and recover the data. This is quite expensive and would probably only be done if the data is worth that much money.
Oh, well, there is another alternative that I think onebuck mentioned once: a home-made mini clean room made of plexiglass. You'd have to build it, and build it well, because a single speck of dust will ruin everything, and it would still cost money. |