Nagios - nrpe plugin configuration
Hi,
I have configured Nagios with the NRPE plugin to check remote hosts. It is configured and working fine for the common plugins:

NAGIOS SERVER libexec $ /usr/local/nagios/libexec/check_nrpe -H remotehost.net
NRPE v2.5.1
NAGIOS SERVER libexec $ /usr/local/nagios/libexec/check_nrpe -H remotehost.net -c check_load
OK - load average: 0.01, 0.00, 0.00|load1=0.010;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;

I have written a plugin of my own (in bash). When I check that plugin from the Nagios server, it gives me the old values:

NAGIOS SERVER libexec $ /usr/local/nagios/libexec/check_nrpe -H remotehost.net -c alarm
ALARM OK Errors = 0 Warns = 0

But when I run the plugin manually on the remote host and then try the same check again from the Nagios server, it gives me the correct values:

REMOTE libexec $ /usr/local/nagios/libexec/alarm.sh
ALARM WARNING Errors = 5 Warns = 2
NAGIOS SERVER libexec $ /usr/local/nagios/libexec/check_nrpe -H remotehost.net -c alarm
ALARM WARNING Errors = 5 Warns = 2

Please suggest.
Nagios/NRPE determine the state of a check from the exit status of the command rather than from the literal text it prints. That is to say, if the shell script completes successfully it will have an exit status of 0 (which Nagios reads as OK) even though you echo the word "ALARM".
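To make that concrete, here is a minimal sketch (not the poster's plugin). The two functions, print_only and print_and_signal, are hypothetical names made up for illustration. Both print the same text, but only the second one sets a status that Nagios would treat as WARNING.

```shell
#!/bin/sh
# Hypothetical stand-in for a plugin that prints a status word
# but never sets an exit status of its own.
print_only() {
    echo "ALARM WARNING Errors = 5 Warns = 2"
}

# Same text, but this one also returns 1, which Nagios maps to WARNING.
print_and_signal() {
    echo "ALARM WARNING Errors = 5 Warns = 2"
    return 1
}

# Capture each status; "|| rc=$?" keeps this safe even under "set -e".
rc=0
print_only || rc=$?
echo "print_only status: $rc"

rc=0
print_and_signal || rc=$?
echo "print_and_signal status: $rc"
```

The first status is 0 because echo succeeded, regardless of the word WARNING in the text. A standalone plugin script would use "exit 1" where the function uses "return 1".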
You have to build your script so it gives the appropriate exit status AND the text you want to see. Another gotcha is to be sure it only returns ONE line of text. Typically you go ahead and define the various statuses as variables, then exit with them. A short example script from one of my servers:

Code:
#!/bin/ksh

So it is the "exit 2" (exit $CRITICAL_STATE) that exits with a status of 2, which Nagios recognizes as "CRITICAL". (I could have named the variable FUNNY_STATE if I'd wanted; it is the value of the exit, not the name, that matters.) Similarly it is the "exit 0" (exit $OK_STATE) that exits with a status of 0, which Nagios recognizes as "OK". (Again, I could have named this variable something like ALL_IS_GOOD, because it is the value rather than the name that is important.)

In fact you don't have to create the variable names at all. You can just put "exit 0", "exit 1" or "exit 2" at the appropriate places. The variables are used mainly so that when you look back at the script you'll know which area of it likely caused the state you're seeing in your Nagios web page.

The exit status for a successful command is 0, and for an unsuccessful one it is nonzero (often 1). If you don't set an exit status explicitly, the status of the script is the status of the final command in the script. (This is shell scripting basics, not a Nagios thing per se.) So say you had a script that did something like:

Code:
ls -l /billybob
ls -l /suzybob
ls -l /jimmybob

If there were no /billybob or /suzybob directory on your system, each of those commands would exit with a nonzero status (file not found). However, if there IS a /jimmybob on your system, the exit status of that command would be 0 (successful, and it would show the file). The status of the script that ran all three lines would be 0, because that was the last status it saw, even though two thirds of the commands failed. So if you want the script to fail when ANY of the directories is missing, you'd have to do:

Code:
if ! ls -l /billybob
then
    echo It failed
    exit 1
fi
if ! ls -l /suzybob
then
    echo It failed
    exit 1
fi
if ! ls -l /jimmybob
then
    echo It failed
    exit 1
fi
echo It succeeded
exit 0

Basically the "exit 1" says to stop with exit code 1, which means the script failed. (Keep in mind that in a Nagios plugin an exit status of 1 is reported as WARNING and 2 as CRITICAL.) It would never get to the next name in the list because you told it to exit at the point it failed.
It would therefore only go to the "It succeeded" message if all 3 had succeeded, and would then do the exit 0. (As noted above, you wouldn't need the exit 0, as that would be the state of the last command anyway.)
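Putting the pieces together, a rough sketch of a plugin that prints exactly one line and sets a matching status might look like the following. This is not the script from my server or the poster's alarm.sh. The function name alarm_status and the thresholds (any error is CRITICAL, any warning is WARNING) are assumptions for illustration only. The state values 0 to 3 are the standard Nagios plugin return codes.

```shell
#!/bin/sh
# Standard Nagios plugin return codes. Only the values matter, not the names.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# alarm_status ERRORS WARNS  (hypothetical helper, thresholds are assumed)
# Prints exactly one status line and returns the matching state.
alarm_status() {
    errors=$1
    warns=$2

    # Reject empty or non-numeric input instead of guessing a state.
    for n in "$errors" "$warns"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "ALARM UNKNOWN counts must be non-negative integers"
                return $STATE_UNKNOWN
                ;;
        esac
    done

    if [ "$errors" -gt 0 ]; then
        echo "ALARM CRITICAL Errors = $errors Warns = $warns"
        return $STATE_CRITICAL
    elif [ "$warns" -gt 0 ]; then
        echo "ALARM WARNING Errors = $errors Warns = $warns"
        return $STATE_WARNING
    fi

    echo "ALARM OK Errors = $errors Warns = $warns"
    return $STATE_OK
}

# In a real plugin the last lines would gather the counts and then do:
#   alarm_status "$errors" "$warns"
#   exit $?
```

Keeping the logic in a function that returns the state makes it easy to try by hand, for example by running alarm_status 0 2 followed by echo $? in a shell that has sourced the file. The final "exit $?" is what hands the state to NRPE.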