Nagios Event Handler issue
Hey Guys,
I followed the Nagios documentation on event handlers to set one up: http://nagios.sourceforge.net/docs/1...thandlers.html

I used the same paths and filenames as that document. The only difference is the Apache restart command in the script it provides, which in my case is:

Code:
ssh user@host -i <RSA Key to login directly> -c "sudo /etc/rc.d/init.d/httpd restart"

rather than:

Code:
/etc/rc.d/init.d/httpd restart

I can test this script as user nagios on my Nagios server, and it executes fine:

Code:
[root@monitor nagios]# su - nagios -c "/usr/local/nagios/libexec/eventhandlers/restart-httpd CRITICAL HARD"
Restarting HTTP service...
[root@monitor nagios]#

So nagios has permission to execute it. Here are my config options for the service in question.

Commands.cnf:

Code:
define command{
    command_name    restart-httpd
    command_line    /usr/local/nagios/libexec/eventhandlers/restart-httpd $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$
    }

Nagios host file:

Code:
define service{
    use                     local-service    ; Name of service template to use
    host_name               content3
    service_description     HTTP
    check_command           check_http
    notifications_enabled   1
    event_handler           restart-httpd
    }

local-service:

Code:
define service{
    name                    local-service    ; The name of this service template
    use                     generic-service  ; Inherit default values from the generic-service definition
    check_period            24x7             ; The service can be checked at any time of the day
    max_check_attempts      4                ; Re-check the service up to 4 times in order to determine its final (hard) state
    normal_check_interval   5                ; Check the service every 5 minutes under normal conditions
    retry_check_interval    1                ; Re-check the service every minute until a hard state can be determined
    contact_groups          admins,mobile    ; Notifications get sent out to everyone in the 'admins' group
    event_handler_enabled   1
    notification_options    w,u,c,r          ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval   60               ; Re-notify about service problems every hour
    notification_period     24x7             ; Notifications can be sent out at any time
    register                0                ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    }

With all of this set, I shut down Apache on the remote server. Nagios detects the outage but never executes the event handler. I even tailed the logs: the service goes straight to the "Critical/Hard" state, and notifications are sent without the event handler ever running. Event handler logging is enabled in the global nagios.cnf.

If anyone can point out what I missed, or where to start looking, I would be very, very happy!

Thanks guys!
Tim
'register 0' means that it's a template and not an actual service/host definition. It looks like you have two templates, local-service and generic-service, and your service uses the local-service template. If the local-service template definition were set to 'register 1', it would no longer be a template; you would have defined it as an actual service/host. Does that make sense?
Anyhow, back to your problem... the obvious thing I can see is your command definition syntax. Your command_line passes $STATETYPE$ as the second argument; try $SERVICESTATETYPE$ instead:
Code:
command_line /usr/local/nagios/libexec/eventhandlers/restart-httpd $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
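For reference, here is a rough sketch of how a handler script might consume the three arguments from that command_line, loosely modeled on the case-statement layout of the sample script in the Nagios docs. The names decide_action, LOG_FILE and RESTART_CMD are made up for illustration, and the default restart command is only a dry-run echo; the real script would run the ssh command from the first post. Logging the raw arguments is one way to check what Nagios actually passes in.

```shell
#!/bin/sh
# Sketch only: decide_action, LOG_FILE and RESTART_CMD are hypothetical names,
# not part of Nagios or of the original restart-httpd script.

LOG_FILE="${LOG_FILE:-/tmp/restart-httpd-handler.log}"
# Dry-run default; swap in the real restart command when ready.
RESTART_CMD="${RESTART_CMD:-echo would restart httpd}"

# Print "restart" when the state/type/attempt combination calls for a
# restart, "skip" otherwise.
#   $1 = service state (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = state type (SOFT or HARD)
#   $3 = current check attempt number
decide_action() {
    case "$1" in
        CRITICAL)
            case "$2" in
                HARD)
                    echo restart
                    ;;
                SOFT)
                    # Try an early restart on the 3rd soft attempt.
                    if [ "$3" = "3" ]; then echo restart; else echo skip; fi
                    ;;
                *)
                    # Anything else, including an unexpanded macro, is ignored.
                    echo skip
                    ;;
            esac
            ;;
        *)
            echo skip
            ;;
    esac
}

# Only act when called with arguments, as Nagios would call it.
if [ "$#" -gt 0 ]; then
    # Record exactly what was passed in, to help debugging.
    echo "$(date) args: $*" >> "$LOG_FILE"
    if [ "$(decide_action "$1" "$2" "$3")" = "restart" ]; then
        $RESTART_CMD
    fi
fi
```

Running it by hand as the nagios user with arguments such as CRITICAL HARD 4 and then checking the log file shows whether the arguments arrive in the form the case statement expects.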