I have a server running Red Hat 7.2 with software RAID 1. I need to stop the RAID resync for now and restart it later when the system is not as busy. What would be the best way to stop the resync?
I was thinking about rebooting the server and pulling one drive, but would there be any possible negative side effects of doing this?
How about using raidhotremove? Could I just remove one drive from the array for now, then add it again later? If so, how would I know which drive to remove from the array?
How many disks is the RAID-1 array composed of? If you had three, one of which failed, I would raidstop /dev/mdX, edit /etc/raidtab to remove the failed disk from the array, and then restart the array. If you had two, one of which failed, you can't restart the array with just one disk, as far as I know.
Take a look at /proc/mdstat. That should tell you what device has failed.
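If you want to script that check, here is a minimal sketch. It assumes the usual mdstat layout, where each array has a header line starting with its name (md0, md1, ...) followed by a status line ending in something like [UU], and where an underscore in that field (for example [U_]) marks a missing member. The function name list_degraded_arrays is made up for this example.

```shell
# list_degraded_arrays [FILE]
# Prints the name of every md array whose member-status field contains an
# underscore (e.g. [U_]), meaning one member is missing or failed.
# FILE defaults to /proc/mdstat; the function name is hypothetical.
list_degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # header line: remember the array name
        /\[[U_]*_[U_]*\]/ && name != "" {    # status field with at least one "_"
            print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling list_degraded_arrays with no argument reads the live /proc/mdstat. The header line of any array it prints shows the member partitions, which tells you which drive to look at.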
There are 2 disks in total in the array. The problem isn't that one of the disks failed; the problem is that the load average from the rebuild is too high and is causing other services to stop. If at all possible, I would like to stop the resync for now and start it later when it's not so busy.
raidstop doesn't work; it tells me the device is busy.
Here is some more info on my setup:
Filesystem  Size  Used Avail Use% Mounted on
/dev/md0     69G   20G   46G  29% /
/dev/md1     45M   22M   21M  50% /boot
none        441M     0  440M   0% /dev/shm
How about /proc/mdstat?
I see that each RAID-1 array has one partition from each drive. So if you had a disk failure, both partitions belonging to that drive should be resyncing at this point. Or did you have a set of bad sectors in one of those partitions that made one array 'dirty'?
We really need to see your /proc/mdstat, and the output of ps -ax | head -20.
As you may be able to see from the ps output, it is the RAID rebuild processes (these are kernel processes, and the user doesn't have control over them) that are keeping your array 'busy'. As far as I've heard, the only way to STOP these processes is to change the configuration so that they do not start up again, and then reboot.
And what kind of resource crunch are we talking about here?
The RAID is finally done and the server load is back in check.
When the server originally crashed, I found the only way to get it to boot was to use a non-SMP kernel (separate issue). Well, the only one I had on the server at the time was the Red Hat 2.4.7 kernel, which the server ran on from 2:00 yesterday through 12:00 today. Using that kernel, the resync process was only at 23% this morning when I checked. That's about 20 hours, roughly 1% of the rebuild per hour!
Sendmail wasn't working (NS lookup failures) and FTP was timing out.
So I upgraded to the latest Red Hat kernel (uniprocessor) and the complete resync took only 4 hours.
For the heck of it, here's my /proc/mdstat:
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hda2 hdd2
      73730240 blocks [2/2] [UU]
md1 : active raid1 hda1 hdd1
      48064 blocks [2/2] [UU]
unused devices: <none>
Finally, here it is.
The answer to your question lies in /proc/sys/dev/raid/speed_limit_max and
These are writable files that tell the kernel how much I/O to spend on resyncing; by default they contain the values 100000 and 100 respectively. The values are in KB per second, with the default max being the theoretical goal. So if you want the resync to slow down or nearly stop, these values can be reduced during peak hours and raised again later, provided you are willing to accept the risk (the array provides reduced redundancy for as long as it has not been brought back to a proper operating condition).
See man 4 md for more information.
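For anyone who wants to try this, here is a rough sketch of the throttling, not a tested recipe. It assumes the two tunables described in md(4), speed_limit_min and speed_limit_max, live under /proc/sys/dev/raid and take plain integers in KB per second. The variable RAID_SYSCTL_DIR and the function names are made up for this example; the variable only exists so the functions can be pointed at ordinary files for a dry run.

```shell
# Directory holding the md resync tunables. RAID_SYSCTL_DIR is a hypothetical
# override for dry runs; on a real system the default path is used.
RAID_SYSCTL_DIR="${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}"

# Print the current floor and ceiling for resync throughput (KB/s).
show_resync_limits() {
    printf 'min=%s max=%s\n' \
        "$(cat "$RAID_SYSCTL_DIR/speed_limit_min")" \
        "$(cat "$RAID_SYSCTL_DIR/speed_limit_max")"
}

# set_resync_max KBPS
# Write a new ceiling. Rejects anything that is not a non-negative integer.
# Writing under /proc/sys requires root.
set_resync_max() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: set_resync_max KBPS" >&2; return 1 ;;
    esac
    echo "$1" > "$RAID_SYSCTL_DIR/speed_limit_max"
}
```

During the busy period you would run something like set_resync_max 1000 as root, check the result with show_resync_limits, and later run set_resync_max 100000 to put back the default quoted above. The setting does not survive a reboot, so nothing needs undoing if the machine restarts.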
You, my friend, are awesome. I'm going to try this next time a RAID server is resyncing and post the results.