Monitor software RAID with Nagios
I recently installed Nagios, mostly to monitor my software RAID1 (md0) and make it send me an e-mail when a disk fails.
(I also installed Merlin and Ninja, but the Ninja web interface didn't find localhost, so I will use the default Nagios web interface for the time being.)
I understand that RAID monitoring (check_raid?) is a plugin, but I can't find a good guide to installing it. The guides either skip steps they assume everyone has already done, or they are broken.
Can someone guide me through setting up RAID monitoring in Nagios?
I'm running CentOS 5.5
I installed Nagios with this guide:
Merlin and Ninja with these:
It is a case of installing the plugin on the machine that has the RAID and having Nagios call check_linux_raid through NRPE. I assume you are running the NRPE client on your client machines, right?
You can also use mdadm's monitor mode (mdadm --monitor, with -m to set the mail address) to send you e-mail alerts when something is up.
I am not really familiar with Ninja or Merlin, but if they do the same or a similar thing to NRPE, then you won't need NRPE.
No, I don't use NRPE. Not that I know of, anyway.
I think I would have to use sendmail or postfix to use -m, but I don't want that.
I am currently working on the problem in a different way:
I am working on a script that checks the output of mdadm --detail /dev/md0 to see whether "Failed Devices" is set to anything other than "0". The script works when I run it manually, but when I put it in crontab it won't run. My guess is that it is a permissions problem.
I wouldn't know unless I saw the error message.
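For reference, the "Failed Devices" check described above could be sketched roughly like this. This is not the poster's actual script, which was never shown; the function name, the messages, the --run switch and the /sbin/mdadm path are assumptions made for illustration. The exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN), so the same script could later be called as a plugin.

```shell
#!/bin/sh
# Sketch of a Nagios-style check for a Linux md array.
# Assumptions (not from the thread): the function name, the messages,
# the --run switch and the /sbin/mdadm path are illustrative only.

# Read `mdadm --detail` output on stdin and report the "Failed Devices" count.
# Return status: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN (Nagios plugin convention).
check_failed_devices() {
    # Take the value after the colon on the "Failed Devices" line, minus whitespace.
    failed=$(awk -F: '/Failed Devices/ { gsub(/[ \t]/, "", $2); print $2; exit }')
    case "$failed" in
        '' | *[!0-9]*)
            echo "RAID UNKNOWN: no Failed Devices line found"
            return 3 ;;
        0)
            echo "RAID OK: 0 failed devices"
            return 0 ;;
        *)
            echo "RAID CRITICAL: $failed failed device(s)"
            return 2 ;;
    esac
}

# Only query a real array when asked to, e.g.: check_md.sh --run /dev/md0
if [ "${1:-}" = "--run" ]; then
    # cron runs with a minimal PATH that often lacks /sbin,
    # so mdadm is called by its full path here.
    /sbin/mdadm --detail "${2:-/dev/md0}" | check_failed_devices
    exit $?
fi
```

A script that works by hand but not from cron is often failing because cron's PATH does not include /sbin, or because the job is not running as root. Redirecting the cron job's output and errors to a log file (for example by appending >> /tmp/check_md.log 2>&1 to the crontab line, with a log path of your choosing) should capture the error message so the cause can be confirmed.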