Quote:
Originally Posted by jostmart
What in the messages indicates a surface error? I don't doubt that this is the case, since I've had many disks from the same manufacturer fail in different machines. I'm just curious.
|
"Medium Error"
The medium in a fixed disk is the disk surface.
Quote:
Are the problems unrecoverable or can something (like the kernel )'tag' the bad sectors to avoid using them?
|
You can try re-adding the drive to the array; the kernel may be able to map around the problem. If it can't (too many errors), the drive will drop out again. You can repeat this process until it works or you get tired of it.
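For example, with mdadm (assuming /dev/md0 is the array and /dev/sdb1 is the member that dropped out; substitute your own device names):
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --add /dev/sdb1
cat /proc/mdstat
The first command clears the failed member out of the array, the second adds it back and starts a resync, and /proc/mdstat lets you watch whether the rebuild completes or the drive drops out again.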
You can also try using the drive as an individual unit (in a workstation, for example). To initialize it and test/remap the surface, run something like:
mke2fs -j -m 0 -c -c /dev/sda1
The '-c -c' performs a read/write test during the initialization to identify and map out bad sectors.
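If you're curious what it found, you can list the blocks that got marked bad afterwards (assuming the same /dev/sda1 partition):
dumpe2fs -b /dev/sda1
That prints the filesystem's bad-block list. If the list keeps growing each time you repeat the test, the drive is still deteriorating and probably isn't worth saving.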
Quote:
Another thing I'm wondering about is why there are several weeks between the partitions being unmounted from the RAID. Maybe this has something to do with bad sector marking? One of the partitions I've had trouble with has been running without problems for about 3 weeks now since it happened.
|
If you mean that you re-added the drive and it dropped out again several weeks later, that's just a function of when the damaged area is next encountered: the drive only fails out of the array when a read or write actually hits the bad sectors.
In a production environment, folks usually just swap the drive and return it to the manufacturer for a replacement (if it's still under warranty). Or give them to employees (after wiping them) to play with.
They still have a useful life (though with reduced capacity). If you can't map out the damaged area (it's too big), you can allocate it to a partition that you never use. I've gotten several additional years' use out of "bad" drives, though most people don't consider it worth their time to play with them.
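A rough sketch of how you might do that (assuming the drive shows up as /dev/sdb and isn't in use): scan the whole disk read-only to find where the damage is, then partition around it:
badblocks -sv /dev/sdb
The block numbers it prints tell you roughly where the bad region sits. Then use fdisk (or parted) on /dev/sdb to create one partition covering the damaged range, which you simply never format or mount, and put your real partitions on the good areas on either side.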