RAID5 corrupt, but drives test fine?
I have a software RAID5 array on CentOS 5.8 that keeps going read-only. A disk dropped out of the array a few weeks back, but when I added it back in, everything worked fine... for a while. Now mdadm shows all the disks as happy and working fine. I pulled the drives (Seagates) and tested them with Seagate Diagnostics, and both the short and long tests come up clean, but the problem persists.
I have run fsck several times; it always finds errors and fixes them. The array then remounts fine for a few hours, and then the problems return. The drives are also in a JBOD enclosure, but the other drives in the JBOD that are not part of the RAID are working without issue. These are fairly new drives (less than a year old). I'm not sure what else to test. Is there any better drive diagnostic software out there? Any help would be great! |
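Since mdadm can report every member as clean while the array still corrupts, it is worth watching the member bitmap in /proc/mdstat directly. A minimal sketch under stated assumptions: `check_mdstat` is a hypothetical helper (not an mdadm command), the here-document stands in for live output, and md0 and the sdX names are placeholders.

```shell
# Hypothetical helper: report DEGRADED if the member bitmap in
# /proc/mdstat output contains a failed slot ("_" inside e.g. [UU_]).
check_mdstat() {
  grep -E '\[[U_]+\]' | grep -q '_' && echo DEGRADED || echo OK
}

# On a real system you would pipe in the live status instead:
#   cat /proc/mdstat | check_mdstat
check_mdstat <<'EOF'
md0 : active raid5 sdc1[2] sdb1[1] sda1[0]
      976512 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
EOF
# prints: OK
```

Running `mdadm --examine` on each member partition also prints an event counter; a member whose counter lags the others has been silently dropping out and rejoining.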
You might want to investigate the drive controller(s), connections and cables. If the drives themselves are all right, it sounds like your issue is at a lower level that causes writes not to be committed properly.
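One cheap check along those lines: the kernel log usually shows link resets and interface CRC errors when a cable or controller port is flaky, even while the drives themselves test clean. On the live box you would just run `dmesg | grep -iE 'hard resetting link|icrc|frozen'`; the helper and sample lines below are purely illustrative.

```shell
# Count kernel-log lines that suggest a link/cable problem rather than
# a failing platter (hard resets, interface CRC errors, frozen ports).
scan_kernel_log() {
  grep -icE 'hard resetting link|icrc|frozen|bus error'
}

# Sample lines of the kind a marginal SATA cable typically produces:
scan_kernel_log <<'EOF'
ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
ata3: hard resetting link
ata3.00: configured for UDMA/133
EOF
# prints: 2
```

If that count keeps climbing while the drives pass their self-tests, reseat or replace the cables and try a different controller port before blaming the disks.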
|
Well I've run fsck 4-5 times over the past week and everything seems to be working for now.
|
Have you checked what S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) says about the drives? I recently had a hard disk failure in my RAID array that I suspected for a while, but nothing detected it until my S.M.A.R.T. report finally gave me enough confidence to determine which drive was failing and replace it.
http://pkgs.org/centos-5-rhel-5/cent....i386.rpm.html
http://sourceforge.net/projects/smartmontools/
http://sourceforge.net/projects/smartsuite/
http://sourceforge.net/projects/smartlinux/
Good luck! |
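For a quick look, smartmontools' `smartctl` covers this. A sketch, assuming the tool is installed: `/dev/sda` is a placeholder for the real device node, and `flag_attrs` is a hypothetical filter I'm adding for illustration, not part of the tool.

```shell
# Real invocations (need root and the actual device node):
#   smartctl -H /dev/sda       # overall health verdict
#   smartctl -A /dev/sda       # attribute table
#   smartctl -t long /dev/sda  # start the drive's built-in long self-test

# Hypothetical filter over `smartctl -A`-style rows: print any of three
# attributes that often predict trouble when their raw value exceeds 0.
flag_attrs() {
  awk '$2 ~ /Reallocated_Sector_Ct|Current_Pending_Sector|UDMA_CRC_Error_Count/ && $NF+0 > 0 {print $2}'
}

# Sample attribute rows (a rising UDMA_CRC_Error_Count points at cabling
# rather than the disk itself):
flag_attrs <<'EOF'
  5 Reallocated_Sector_Ct   0x0033 100 100 036 Pre-fail Always - 0
197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always - 8
199 UDMA_CRC_Error_Count    0x003e 200 200 000 Old_age  Always - 0
EOF
# prints: Current_Pending_Sector
```

Given the symptoms in this thread, UDMA_CRC_Error_Count is the one to watch: it counts errors on the cable/interface, which would fit drives that pass their own diagnostics.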