How to monitor remote host using nagios
Hi ALL,
Yesterday i install nagios, after that i can successfully monitor my local machine disk usage and its services, but now i want to monitor remote host, but have no idea how to do this , i also google alot but not getting resource full explanation ... so please guys tell me how can i monitor the remote host using nagios.... Warm Regards Ashish Sood |
I suggest you actually read the documentation Nagios comes with and look at your current configuration. Try, experiment, then ask more specific questions. I'm not telling you to go RT(F)M but Nagios documentation is by no way substandard.
|
I just started messing around with this myself and while the documentation is definitely out there, it's not necessarily easy to understand or put together in one place. If you go to the Nagios website and try to use the Quickstart guides, the portion on monitoring remote hosts says
Quote:
That being said, Googling it does return useful results. I would suggest reading http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf. That helped a ton (although it was still a lot of trial and error). Also, there is an official Nagios book that I went and bought after visiting this Geek Stuff article. That ought to get you started. |
Thanks for your reply, now i can sucessfully monitor remote host ssh service , but still i m facing a problem when i try to monitor remote host user and disk storage i got error message configuration validation failed message after restarting nagios service, but not problem when i monitor remote ssh service
[Below is my remote.cfg file] define host{ use linux-server host_name station10 alias station10.ducat.com address 192.168.10.11 } define service{ use generic-service host_name station10 service_description SSH check_command check_ssh notifications_enabled 0 } define service{ use generic-service host_name station10 service_description check_mem check_command check_users notifications_enabled 0 } [below is the output of nagios configuration check when check it got this error message] Nagios Core 3.2.3 Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors Copyright (c) 1999-2009 Ethan Galstad Last Modified: 10-03-2010 License: GPL Website: http://www.nagios.org Reading configuration data... Read main config file okay... Processing object config file '/etc/nagios/objects/commands.cfg'... Processing object config file '/etc/nagios/objects/contacts.cfg'... Processing object config file '/etc/nagios/objects/timeperiods.cfg'... Processing object config file '/etc/nagios/objects/templates.cfg'... Processing object config file '/etc/nagios/objects/remote.cfg'... Processing object config file '/etc/nagios/objects/localhost.cfg'... Read object config files okay... Running pre-flight check on configuration data... Checking services... Error: Service check command 'check_users' specified in service 'check_mem' for host 'station10' not defined anywhere! Checked 10 services. Checking hosts... Checked 2 hosts. Checking host groups... Checked 1 host groups. Checking service groups... Checked 0 service groups. Checking contacts... Checked 1 contacts. Checking contact groups... Checked 1 contact groups. Checking service escalations... Checked 0 service escalations. Checking service dependencies... Checked 0 service dependencies. Checking host escalations... Checked 0 host escalations. Checking host dependencies... Checked 0 host dependencies. Checking commands... Checked 24 commands. Checking time periods... Checked 5 time periods. Checking for circular paths between hosts... Checking for circular host and service dependencies... Checking global event handlers... Checking obsessive compulsive processor commands... Checking misc settings... Total Warnings: 0 Total Errors: 1 ***> One or more problems was encountered while running the pre-flight check... Check your configuration file(s) to ensure that they contain valid directives and data defintions. If you are upgrading from a previous version of Nagios, you should be aware that some variables/definitions may have been removed or modified in this version. Make sure to read the HTML documentation regarding the config files, as well as the 'Whats New' section to find out what has changed. [/B] i know there problem in my check_users command but this command plugins is install in the path of /usr/lib/nagios/plugins/, please tell me guys how to fix it. Warm Regards Ashish Sood |
You need to look into the NRPE plugin to monitor Remote Linux Hosts.
|
Yes i install nagios-nrpe and nagios-plugins rpm on both remote and server machine , and all the plugins is install on /usr/lib/nagios/plugins directory
|
Here is the step by step guide to get NRPE to work with Nagios
NRPE-Guide read between the lines to get the bigger picture on how it works... also use the test in it, to verify that the services are running and communicating properly |
I want to tell you that i install nagios , nrpe , nagios-plgins using yum (mean RPM not tar balls), and my plugins directory is
/usr/lib/nagios/plugins here all the plugins is install , i there is not check_nrpe plugins here, and on the other side i can able to monitor remote ssh server, but cant be able to monitor remote disk usage space. i mention above my remote.cfg file and also mention errors when i try to compile my nagios.cfg file please take a look above |
As for being able to monitor the ssh service, you can because that is not using NRPE - its just checking that the port is alive with a port scan.
If your check_nrpe is not there install the TAR ball. Use the source, the check_nrpe will be there. You can not trust an RPM that was configured by someone else and put in a repo. If you installed NRPE in another location specify that when you ./configure the plugins. That link I gave shows you exactly HOW TO configure NRPE for client and host. If you are still having issues I recommend the README file. |
thanks for your replay zerosignal, i sucessfully install check_nrpe on my system using yum nagios-plugins-nrpe now i can reach to my remote system
#/usr/lib/nagios/plugins/check_nrpe -H 192.168.10.11 NRPE v2.12 192.168.10.11 is my remote system still unable to monitor remote disk i also added the check_nrpe & check_disk plugins to my command.cfg file there is no problem when i restart my nagios service But i monitor to my remote system using browser it got NRPE: Command 'check_disk' not defined Below i will mentions my remote.cfg and command.cg file [remote.cfg] define host{ use linux-server host_name station10 alias station10.ducat.com address 192.168.10.11 } define service{ use generic-service host_name station10 service_description SSH check_command check_ssh notifications_enabled 0 } define service{ use generic-service host_name station10 service_description Root Partition check_command check_nrpe!check_disk } [command.cfg] ############################################################################### # COMMANDS.CFG - SAMPLE COMMAND DEFINITIONS FOR NAGIOS 3.2.3 # # Last Modified: 05-31-2007 # # NOTES: This config file provides you with some example command definitions # that you can reference in host, service, and contact definitions. # # You don't need to keep commands in a separate file from your other # object definitions. This has been done just to make things easier to # understand. # ############################################################################### ################################################################################ # # SAMPLE NOTIFICATION COMMANDS # # These are some example notification commands. They may or may not work on # your system without modification. As an example, some systems will require # you to use "/usr/bin/mailx" instead of "/usr/bin/mail" in the commands below. # ################################################################################ # 'notify-host-by-email' command definition define command{ command_name notify-host-by-email command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$ } # 'notify-service-by-email' command definition define command{ command_name notify-service-by-email command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$ } ################################################################################ # # SAMPLE HOST CHECK COMMANDS # ################################################################################ # This command checks to see if a host is "alive" by pinging it # The check must result in a 100% packet loss or 5 second (5000ms) round trip # average time to produce a critical error. # Note: Five ICMP echo packets are sent (determined by the '-p 5' argument) # 'check-host-alive' command definition define command{ command_name check-host-alive command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 } ################################################################################ # # SAMPLE SERVICE CHECK COMMANDS # # These are some example service check commands. They may or may not work on # your system, as they must be modified for your plugins. See the HTML # documentation on the plugins for examples of how to configure command definitions. # # NOTE: The following 'check_local_...' functions are designed to monitor # various metrics on the host that Nagios is running on (i.e. this one). ################################################################################ #check_nrpe command definition define command{ command_name check_nrpe command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ } #check Remote disk partions define command{ command_name check_disk command_line $USER1$/check_disk -H $HOSTADDRESS -w $ARG1$ -c $ARG2$ -p $ARG3$ } # 'check_local_disk' command definition define command{ command_name check_local_disk command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$ } # 'check_local_load' command definition define command{ command_name check_local_load command_line $USER1$/check_load -w $ARG1$ -c $ARG2$ } # 'check_local_procs' command definition define command{ command_name check_local_procs command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$ } # 'check_local_users' command definition define command{ command_name check_local_users command_line $USER1$/check_users -w $ARG1$ -c $ARG2$ } # 'check_local_swap' command definition define command{ command_name check_local_swap command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$ } # 'check_local_mrtgtraf' command definition define command{ command_name check_local_mrtgtraf command_line $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$ } ################################################################################ # NOTE: The following 'check_...' commands are used to monitor services on # both local and remote hosts. ################################################################################ # 'check_ftp' command definition define command{ command_name check_ftp command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$ } # 'check_hpjd' command definition define command{ command_name check_hpjd command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$ } # 'check_snmp' command definition define command{ command_name check_snmp command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$ } # 'check_http' command definition define command{ command_name check_http command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$ } # 'check_ssh' command definition define command{ command_name check_ssh command_line $USER1$/check_ssh $ARG1$ $HOSTADDRESS$ } # 'check_dhcp' command definition define command{ command_name check_dhcp command_line $USER1$/check_dhcp $ARG1$ } # 'check_ping' command definition define command{ command_name check_ping command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5 } # 'check_pop' command definition define command{ command_name check_pop command_line $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$ } # 'check_imap' command definition define command{ command_name check_imap command_line $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$ } # 'check_smtp' command definition define command{ command_name check_smtp command_line $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$ } # 'check_tcp' command definition define command{ command_name check_tcp command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$ } # 'check_udp' command definition define command{ command_name check_udp command_line $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$ } # 'check_nt' command definition define command{ command_name check_nt command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$ } ################################################################################ # # SAMPLE PERFORMANCE DATA COMMANDS # # These are sample performance data commands that can be used to send performance # data output to two text files (one for hosts, another for services). If you # plan on simply writing performance data out to a file, consider using the # host_perfdata_file and service_perfdata_file options in the main config file. # ################################################################################ # 'process-host-perfdata' command definition define command{ command_name process-host-perfdata command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOS TOUTPUT$\t$HOSTPERFDATA$\n" >> /var/nagios/host-perfdata.out } # 'process-service-perfdata' command definition define command{ command_name process-service-perfdata command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$ \t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /var/nagios/service-perfdata.out } |
paste your nrpe.cfg in this thread
|
[nrpe.cfg]
############################################################################# # Sample NRPE Config File # Written by: Ethan Galstad (nagios@nagios.org) # # Last Modified: 11-23-2007 # # NOTES: # This is a sample configuration file for the NRPE daemon. It needs to be # located on the remote host that is running the NRPE daemon, not the host # from which the check_nrpe client is being executed. ############################################################################# # LOG FACILITY # The syslog facility that should be used for logging purposes. log_facility=daemon # PID FILE # The name of the file in which the NRPE daemon should write it's process ID # number. The file is only written if the NRPE daemon is started by the root # user and is running in standalone mode. pid_file=/var/run/nrpe.pid # PORT NUMBER # Port number we should wait for connections on. # NOTE: This must be a non-priviledged port (i.e. > 1024). # NOTE: This option is ignored if NRPE is running under either inetd or xinetd server_port=5666 # SERVER ADDRESS # Address that nrpe should bind to in case there are more than one interface # and you do not want nrpe to bind on all interfaces. # NOTE: This option is ignored if NRPE is running under either inetd or xinetd #server_address=127.0.0.1 # NRPE USER # This determines the effective user that the NRPE daemon should run as. # You can either supply a username or a UID. # # NOTE: This option is ignored if NRPE is running under either inetd or xinetd nrpe_user=nagios # NRPE GROUP # This determines the effective group that the NRPE daemon should run as. # You can either supply a group name or a GID. # # NOTE: This option is ignored if NRPE is running under either inetd or xinetd nrpe_group=nagios # ALLOWED HOST ADDRESSES # This is an optional comma-delimited list of IP address or hostnames # that are allowed to talk to the NRPE daemon. # # Note: The daemon only does rudimentary checking of the client's IP # address. I would highly recommend adding entries in your /etc/hosts.allow # file to allow only the specified host to connect to the port # you are running this daemon on. # # NOTE: This option is ignored if NRPE is running under either inetd or xinetd allowed_hosts=127.0.0.1 # COMMAND ARGUMENT PROCESSING # This option determines whether or not the NRPE daemon will allow clients # to specify arguments to commands that are executed. This option only works # if the daemon was configured with the --enable-command-args configure script # option. # # *** ENABLING THIS OPTION IS A SECURITY RISK! *** # Read the SECURITY file for information on some of the security implications # of enabling this variable. # # Values: 0=do not allow arguments, 1=allow command arguments dont_blame_nrpe=0 # COMMAND PREFIX # This option allows you to prefix all commands with a user-defined string. # A space is automatically added between the specified prefix string and the # command line from the command definition. # # *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! *** # Usage scenario: # Execute restricted commmands using sudo. For this to work, you need to add # the nagios user to your /etc/sudoers. An example entry for alllowing # execution of the plugins from might be: # # nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/ # # This lets the nagios user run all commands in that directory (and only them) # without asking for a password. If you do this, make sure you don't give # random users write access to that directory or its contents! # command_prefix=/usr/bin/sudo # DEBUGGING OPTION # This option determines whether or not debugging messages are logged to the # syslog facility. # Values: 0=debugging off, 1=debugging on debug=0 # COMMAND TIMEOUT # This specifies the maximum number of seconds that the NRPE daemon will # allow plugins to finish executing before killing them off. command_timeout=60 # CONNECTION TIMEOUT # This specifies the maximum number of seconds that the NRPE daemon will # wait for a connection to be established before exiting. This is sometimes # seen where a network problem stops the SSL being established even though # all network sessions are connected. This causes the nrpe daemons to # accumulate, eating system resources. Do not set this too low. connection_timeout=300 # WEEK RANDOM SEED OPTION # This directive allows you to use SSL even if your system does not have # a /dev/random or /dev/urandom (on purpose or because the necessary patches # were not applied). The random number generator will be seeded from a file # which is either a file pointed to by the environment valiable $RANDFILE # or $HOME/.rnd. If neither exists, the pseudo random number generator will # be initialized and a warning will be issued. # Values: 0=only seed from /dev/[u]random, 1=also seed from weak randomness #allow_weak_random_seed=1 # INCLUDE CONFIG FILE # This directive allows you to include definitions from an external config file. #include=<somefile.cfg> # INCLUDE CONFIG DIRECTORY # This directive allows you to include definitions from config files (with a # .cfg extension) in one or more directories (with recursion). #include_dir=<somedirectory> #include_dir=<someotherdirectory> # COMMAND DEFINITIONS # Command definitions that this daemon will run. Definitions # are in the following format: # # command[<command_name>]=<command_line> # # When the daemon receives a request to return the results of <command_name> # it will execute the command specified by the <command_line> argument. # # Unlike Nagios, the command line cannot contain macros - it must be # typed exactly as it should be executed. # # Note: Any plugins that are used in the command lines must reside # on the machine that this daemon is running on! The examples below # assume that you have plugins installed in a /usr/local/nagios/libexec # directory. Also note that you will have to modify the definitions below # to match the argument format the plugins expect. Remember, these are # examples only! # The following examples use hardcoded command arguments... command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10 command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20 command[check_sda2]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda2 command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200 # The following examples allow user-supplied arguments and can # only be used if the NRPE daemon was compiled with support for # command arguments *AND* the dont_blame_nrpe directive in this # config file is set to '1'. This poses a potential security risk, so # make sure you read the SECURITY file before doing this. #command[check_users]=/usr/lib/nagios/plugins/check_users -w $ARG1$ -c $ARG2$ #command[check_load]=/usr/lib/nagios/plugins/check_load -w $ARG1$ -c $ARG2$ #command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$ #command[check_procs]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$ |
Quote:
|
Thanks For your reply
I uncomment command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$ from nrpe.cfg file and its still show a error message NRPE: Command 'check_disk' not defined on a web browser . |
At this point, I don't know what you are missing. I am going to suggest you read the Nagios Documentation and follow that. Back trace everything you created, follow it step by step.. Maybe even actually compiling the source instead of using the RPM.
Seems most of the Documentation out there is for building form source. If you choose to use the RPM's then just follow the steps in the config and make sure its correct. Ill be honest I never had any luck setting it up with the RPM's mine was from source, and I don't have my documentation with me. So I recommend reading the Nagios documentation. http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf |
All times are GMT -5. The time now is 04:55 AM. |