RAID 5 issues
Hi all
I have been running a RAID 5 server for nearly 1.5 years now and all seemed fine up to now. The RAID 5 was built with 4 drives of 1 TB, leaving me 3 TB of disk. On top of it I run LVM. Now, after rebooting, the drives are not showing up properly. I need to force it to assemble, and then it starts to rebuild:

root@sacorria:~$ mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Feb  6 13:42:47 2013
     Raid Level : raid5
     Array Size : 2929890816 (2794.16 GiB 3000.21 GB)
  Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Apr 21 07:16:34 2015
          State : active, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

 Rebuild Status : 1% complete

           Name : sacorria:0  (local to host sacorria)
           UUID : 3a451151:b4dd6c80:7a7c8668:16b13fa8
         Events : 490718

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       5       8       33        1      spare rebuilding   /dev/sdc1
       3       8       49        2      active sync   /dev/sdd1
       4       8        1        3      active sync   /dev/sda1

Next, I see it move to setting sda1 as faulty:

root@sacorria:~$ mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Feb  6 13:42:47 2013
     Raid Level : raid5
     Array Size : 2929890816 (2794.16 GiB 3000.21 GB)
  Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Apr 21 07:19:08 2015
          State : active, FAILED, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

 Rebuild Status : 2% complete

           Name : sacorria:0  (local to host sacorria)
           UUID : 3a451151:b4dd6c80:7a7c8668:16b13fa8
         Events : 490748

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       5       8       33        1      spare rebuilding   /dev/sdc1
       3       8       49        2      active sync   /dev/sdd1
       4       8        1        3      faulty   /dev/sda1

And next it becomes even better, I get:

root@sacorria:~$ mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Feb  6 13:42:47 2013
     Raid Level : raid5
     Array Size : 2929890816 (2794.16 GiB 3000.21 GB)
  Used Dev Size : 976630272 (931.39 GiB 1000.07 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Apr 21 07:19:34 2015
          State : active, FAILED
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : sacorria:0  (local to host sacorria)
           UUID : 3a451151:b4dd6c80:7a7c8668:16b13fa8
         Events : 490752

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed
       3       8       49        2      active sync   /dev/sdd1
       3       0        0        3      removed
       4       8        1        -      faulty   /dev/sda1
       5       8       33        -      spare   /dev/sdc1

/dev/sda is actually giving errors when I do a smartctl test, but sdc does not. How do I make /dev/sdc active instead of spare? I can mount it, but I am worried that the array will not last long. I have seen the recreate with --assume-clean etc. online, but am wary of performing this action. Have some of you done this successfully? I would hate to lose 3 TB of data.

Kind regards
Steve
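PS: for reference, the assemble I have to force after a reboot is something along these lines (member partitions as shown in the output above; I am writing this from memory, so the exact invocation may differ slightly):

# forced assemble of the four members into md0
root@sacorria:~$ mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sda1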
Can you post mdadm --examine and smartctl -a for all drives?
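Something along these lines, run as root, should do it (assuming the members are still /dev/sd[abcd]1 as in your mdadm -D output):

# superblock details (event count, update time) for each member partition,
# plus SMART data for the underlying disk
for d in a b c d; do
    mdadm --examine /dev/sd${d}1
    smartctl -a /dev/sd${d}
done

The --examine output shows each member's event count and update time, which tells us how far out of date sdc1 really is.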
The big question is when/why sdc1 was kicked from your array, and how outdated/useful the data on it still is. With sdc1 out of date, your best option would be to ddrescue sda to a new disk and hope that there are not too many errors. With RAID you must test your disks regularly for damage (smartmontools, long self-tests, etc.). If you only notice errors during a rebuild (because the rebuild is the only test you have run in ages), it is too late. Without tests, disk errors can go unnoticed for a very long time...
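Something like this, assuming the replacement disk shows up as /dev/sde (double-check the device names first; ddrescue overwrites the target):

# copy the failing disk to the new one, keeping a map file so the copy can be resumed
ddrescue -f /dev/sda /dev/sde /root/sda-rescue.map

# and for the regular testing: kick off a long self-test on each disk
smartctl -t long /dev/sda

When ddrescue finishes, its summary tells you how much could not be read; the self-test result shows up later in the smartctl -a output for that disk.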
Hi Frost
I gave up on the RAID and am now on a ZFS-based RAID 10 solution.

Kind regards
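In case anyone finds this thread later: a ZFS "RAID 10" is just striped mirror vdevs, and creating one looks roughly like this (pool and device names here are only placeholders, not my actual layout):

# two mirrored pairs, striped together by the pool
zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd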