LinuxQuestions.org


nass 09-12-2011 08:34 AM

disk fails as part of array, but i can reassemble it (then it gets lost again)
 
Hello everyone,
I seem to have lost a disk from one of my arrays.

mdstat shows:
Quote:

root@samothraki:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md126 : active raid1 sdf1[1] sdd1[0]
      732466418 blocks super 1.2 [2/2] [UU]

md127 : active raid1 sde1[1](F) sdc1[0]
      1465038451 blocks super 1.2 [2/1] [U_]

and /var/log/messages.1 states that:

Quote:

... (more of the same)
...
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] Unhandled error code
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] Result: hostbyte=0x04 driverbyte=0x00
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] CDB: cdb[0]=0x2a: 2a 00 9f 91 00 3f 00 04 00 00
Sep 8 05:39:57 samothraki kernel: md: md127: recovery done.
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] Unhandled error code
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] Result: hostbyte=0x04 driverbyte=0x00
Sep 8 05:39:57 samothraki kernel: sd 3:0:0:0: [sde] CDB: cdb[0]=0x2a: 2a 00 9f 91 04 3f 00 00 80 00

the disk doesn't respond to smartctl calls
smartctl -a -d ata /dev/sde
smartctl 5.40 2010-10-16 r3189 [x86_64-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.


Yet I can still manage to rebuild the array...

Should I go ahead and replace the drive?

Thank you very much for your help.
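
The degraded state visible in the mdstat output above (`[2/1] [U_]` for md127) can also be picked up from a script. The following is only a minimal sketch, assuming the usual /proc/mdstat layout shown in the quote; the function name `degraded_arrays` is made up for illustration and is not part of mdadm or any other tool.

```shell
# Print the name of every md array whose member-status field (e.g. [U_])
# contains an underscore, i.e. has a missing or failed member.
# Reads mdstat-formatted text from the file given as $1,
# defaulting to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                       # remember current array
        $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }   # status shows a gap
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads the live /proc/mdstat. For an array it reports, `mdadm --detail` on that device lists the state of each member.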

Mark Pettit 09-12-2011 08:49 AM

Your disk is dying. There is not much point persisting with it if it becomes unreliable.

nass 09-12-2011 09:00 AM

Arrghh.. it's only 2-3 years old!!! How can it be?
I mean seriously, is it just the worst component, or is there some other problem?
I have an 850W PSU and 8 disks, and I have already replaced one of them...

Mark Pettit 09-12-2011 10:36 AM

Well - it may have been dropped at some stage. I work with servers with lots of disks. Where the high-end SCSI stuff (10k rpm or 15k rpm) is expensive and very reliable, the lower-end commodity SATA stuff is just the opposite. We use high-capacity servers for online (i.e. disk-to-disk) backups of databases - for this we use large arrays of SATA drives. And I can tell you, the more drives you have, the higher the chance of one failing. I suspect that is simply the law of probability :-), but I'm not a maths/stats boffin. We buy extra drives and have them lying around waiting for failures - as soon as one happens it's usually a hot-swap and we're off again. The disk goes into the bin! Your data is probably important to you - hence the 8-drive array. Chuck the drive.
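
The "law of probability" point can be made concrete. Assuming every drive fails independently with the same probability p over some period (a simplification), the chance that at least one of n drives fails is 1 - (1 - p)^n. The sketch below just evaluates that formula; the function name `any_drive_fails` and the 3% per-drive figure are illustrative assumptions, not measured failure rates.

```shell
# Chance that at least one of n drives fails, assuming independent failures
# with the same per-drive probability p. Arguments: p n.
# The values used below are illustrative assumptions only.
any_drive_fails() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", 1 - (1 - p) ^ n }'
}

any_drive_fails 0.03 1   # a single drive
any_drive_fails 0.03 8   # an 8-drive box
```

With these assumed numbers, the 8-drive case comes out at roughly 22% against 3% for a single drive, which is the effect described above.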


All times are GMT -5.