Weird sudden problem with 160GB SATA HDD with ext2/3
Sorry if this has been asked before, I did a search before posting but couldn't find anything. Please post links to existing threads if I have missed them and save wasting everyone's time.
The situation is this:
I purchased a 160GB SATA hard drive about a month ago to replace a dying 80GB IDE which contained my /home directory.
I set it to IDE mode (instead of RAID) in the BIOS, then partitioned and formatted it (ext3) without problems. I used it for a month with multiple reboots, checking syslog along the way, and saw nothing to worry about.
Two days ago, I booted up and noticed most of my files missing from my home directory, along with a directory with a weird name (weird characters which I thought were "illegal" to use in a directory name). I rebooted to find my home folder completely inaccessible.
I checked the data cable and power plugs on the drive, all secured.
I booted into Knoppix, thinking this brand new drive was dead and that I would try to rescue some of my important data. Knoppix couldn't access it: group descriptors corrupted, unable to find a valid ext3 filesystem.
After 10 or more fsck runs, I concluded something was wrong with the drive, as every time I ran fsck it found inodes in the wrong spot and offered to relocate them. I said yes, and it just kept running in a loop, offering to relocate and me confirming.
I ran a diagnostics program for the Seagate hard drive from UltimateBootCD, all of the several short tests came up as passed, S.M.A.R.T has not been tripped, full sector scan came up as passed.
Having concluded that nothing was physically wrong with the drive and that I had lost all of my data, I proceeded to repartition and reformat the drive. I tried Partition Magic 8, cfdisk, fdisk and QTParted from the Knoppix CD. They all claimed the partitioning went smoothly.
Every attempt to format the drive completed with no reported problems. But I could never mount the partition (sda1), even directly afterwards: "group descriptors corrupted". How could they become corrupted immediately after a format?
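One read-only way to see what actually landed on the disk after a format is to decode the primary superblock by hand. This is only a minimal sketch in Python, not a replacement for dumpe2fs: it relies on the ext2/ext3 on-disk layout (superblock at byte offset 1024, magic 0xEF53), and the function names `parse_superblock` and `read_superblock` are made up for illustration.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the primary superblock starts 1024 bytes into the partition
EXT_MAGIC = 0xEF53         # s_magic value shared by ext2 and ext3


def parse_superblock(raw):
    """Decode a few geometry fields from the raw bytes of an ext2/3 superblock."""
    blocks_count, = struct.unpack_from("<I", raw, 4)
    first_data_block, log_block_size = struct.unpack_from("<II", raw, 20)
    blocks_per_group, = struct.unpack_from("<I", raw, 32)
    magic, = struct.unpack_from("<H", raw, 56)
    return {
        "magic_ok": magic == EXT_MAGIC,
        "block_size": 1024 << log_block_size,
        "blocks_count": blocks_count,
        "first_data_block": first_data_block,
        "blocks_per_group": blocks_per_group,
    }


def read_superblock(path):
    """Read the superblock from a partition or image file (opened read-only)."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(dev.read(1024))
```

Something like `read_superblock("/dev/sda1")` would need root, and it only reads. If the magic is wrong or the geometry looks absurd straight after a format, the writes are probably not reaching the disk where the formatter thinks they are.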
Tried rebooting after deleting all partitions, rebooting after creating a partition, rebooting after formatting, still Knoppix wouldn't mount the partition. Repeated fscks came up as inodes in the wrong place and group descriptors corrupted.
It just seems as if this hard drive no longer wants to cooperate. I've not tried ReiserFS yet; I'm a bit of a fan of ext2 and ext3.
Is there anything I've missed that I could try? After two days, it really is very frustrating.
The most recent mount attempt has reported this in dmesg:
EXT3-fs error (device sda1): ext3_check_descriptors: Block bitmap for group 856 not in group (block 28016640)!
EXT3-fs: group descriptors corrupted!
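For anyone reading that message later: the kernel is complaining that the block bitmap recorded for group 856 lies outside the range of blocks that belongs to group 856. The arithmetic can be sketched as below. The geometry is an assumption, not something taken from this disk: 4 KiB blocks, 32768 blocks per group and a first data block of 0, which is what mke2fs typically chooses for a partition this size. The helper names are hypothetical.

```python
def group_block_range(group, blocks_per_group=32768, first_data_block=0):
    """Return (first, last) block numbers of a full-sized block group.

    Assumed geometry: 32768 blocks per group, first data block 0.
    """
    first = first_data_block + group * blocks_per_group
    return first, first + blocks_per_group - 1


def group_containing(block, blocks_per_group=32768, first_data_block=0):
    """Return the index of the group that a block number falls into."""
    return (block - first_data_block) // blocks_per_group


reported = 28016640                    # block number from the dmesg line
first, last = group_block_range(856)
print(first, last)                     # 28049408 28082175
print(first <= reported <= last)       # False
print(group_containing(reported))      # 855
```

Under that assumed geometry, the reported block is the very first block of group 855, so the descriptor for group 856 holds a value that would belong to a different group. That is only an observation from the numbers; it does not by itself say whether the cause is hardware or software.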
First, I recommend that you take the hard drive back to Seagate under the warranty.
Otherwise, I think that you have some bad blocks where the partition table goes. Try running the program called badblocks against the drive and see what it tells you.
I last fixed a bad spot on a disk about 15 years ago. As I remember it, hard drives have spare blocks at the end of the drive. There are repair programs that you can run which will detect bad blocks and assign some of the spare blocks to take the place of the bad blocks. These programs are device specific, so you need to get such a program from Seagate. You may already have the "assign alternative blocks" program in the Seagate diagnostic tool you are using.
I had used SeaTools, I think it was, from UltimateBootCD to perform the surface scan.
I just ran "badblocks -s /dev/sda" on the drive, which showed nothing.
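As a rough illustration of why that scan can come back clean, here is a sketch of what a read-only scan amounts to. It is not the badblocks source; the function name `find_unreadable_blocks` is hypothetical, and the 1024-byte block size is the badblocks default. The point is that such a scan only proves every block can be read, not that data written to a block comes back intact.

```python
import io


def find_unreadable_blocks(dev, num_blocks, block_size=1024):
    """Return the block numbers that raise an I/O error or come back short.

    `dev` is any seekable binary file object opened for reading.
    """
    bad = []
    for n in range(num_blocks):
        try:
            dev.seek(n * block_size)
            data = dev.read(block_size)
        except OSError:
            bad.append(n)   # the device reported a read error
            continue
        if len(data) != block_size:
            bad.append(n)   # short read: block is missing or truncated
    return bad


# Demo on an in-memory "disk" of four zero-filled blocks.
image = io.BytesIO(bytes(4096))
print(find_unreadable_blocks(image, 4))   # []
```

A scan like this passes on a disk whose writes land in the wrong place, because every sector still reads without error.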
This is the thing which is confusing me. In my opinion, the drive is showing all the signs of possible bad sectors (though why they didn't show up when I first installed it puzzles me), but all the tools I've used so far show nothing wrong. It's only after fresh partitioning and formatting, when I attempt to mount it, that I see the problems.
Do you have any other suggestions? I'm not against taking the drive back, in fact, I am keen to do so, but I feel I need to exhaust all other reasonable options to ensure the drive really is in bad shape. I will also try the most recent edition of Seagate's tools to see what it can offer me.
PS: I almost thought the RAM drive I made in Linux before all this happened was the cause of it, but I guess it really couldn't have anything to do with it, i.e. software vs hardware.
The fact that the problem is intermittent points to a hardware problem. A bad block or sector can be intermittent. Another possible cause could be dirty heads. With dirty heads the problem would be intermittent both as to time and sector. Also a dirty head problem would tend to go from intermittent to solid much faster than a bad surface problem.
Actually, you should not have either problem on a month old drive. I suggest that you just replace the drive under warranty.
Yes. After trying the latest SeaTools package, as well as Partition Magic to check everything, I even thought of seeing what the Windows XP installer had to say, and sure enough: "disk may be damaged".
I'll be taking it back to the store today to get a replacement. I've never run into this problem before, so was unsure of how to deal with it. Thanks for your suggestions and advice.