LinuxQuestions.org
Old 05-21-2012, 02:26 PM   #1
jrb328
LQ Newbie
 
Registered: May 2012
Posts: 5

Rep: Reputation: Disabled
Nagios/NRPE


Hello,

I am running a few self-made plugins using NRPE, which is configured to use xinetd to communicate with remote servers. The first few executions of a plugin return the output I expect, but subsequent executions return "CHECK_NRPE: Socket timeout after 10 seconds."

I am attempting to use the localhost IP address (127.0.0.1) as the IP for several hosts, each of which would display performance data on Nagiosgraph with rrd files located in the corresponding hosts' rrddir. I can only execute the plugins by passing 127.0.0.1 as the -H argument value (as opposed to using the specific hostname). This is the method of execution which leads to intermittent failures.

Thanks
 
Old 05-21-2012, 04:49 PM   #2
ratotopi
Member
 
Registered: Dec 2011
Posts: 110

Rep: Reputation: 6
Are all of your servers that run NRPE also running Nagios? If not, you will have to configure NRPE to accept connections from at least one server IP that is running Nagios.
 
Old 05-24-2012, 10:46 AM   #3
jrb328
LQ Newbie
 
Registered: May 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
Re: NRPE

Hey,

Thank you for your response. I am running Nagios 3.3.1 and NRPE 2.13 on my main server, which I will refer to as A1. I have modified nrpe.cfg so that A1 recognizes itself as an allowed host. Here is where my setup differs from a typical case: the plugins I have written use a Perl module to retrieve device statistics from remote servers which do not run NRPE at all. Instead, I use A1 as a hub which executes the plugins locally and asks each remote server to send back its statistics, which are then formatted and returned to Nagios.
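The "formatted and returned to Nagios" step can be sketched as follows. This is only a minimal illustration in shell, not the poster's Perl plugins, which are not shown in the thread. The function name `nagios_result` and the thresholds are hypothetical; the exit codes (0 OK, 1 WARNING, 2 CRITICAL) and the `status | perfdata` layout follow the usual Nagios plugin conventions.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print a Nagios-style result
# line and return the matching plugin exit code.
# Usage: nagios_result LABEL VALUE WARN CRIT   (integer values assumed)
nagios_result() {
    label=$1 value=$2 warn=$3 crit=$4
    if [ "$value" -ge "$crit" ]; then
        state=CRITICAL; code=2
    elif [ "$value" -ge "$warn" ]; then
        state=WARNING; code=1
    else
        state=OK; code=0
    fi
    # Text before '|' is the human-readable status; text after it is perfdata.
    printf '%s - %s %s | %s=%s;%s;%s\n' \
        "$state" "$value" "$label" "$label" "$value" "$warn" "$crit"
    return "$code"
}
```

For example, `nagios_result pct 60 80 90` prints an OK line and returns 0, which is the status NRPE would pass back to Nagios.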

Another detail worth noting is that since I am using Nagiosgraph and rrdtool to graphically represent my performance data in the web interface, I needed to create a host object for each device which A1 communicates with even though the -H flag of check_nrpe is always 127.0.0.1 (localhost) for these plugins to use the scripts on A1. I have set the IP addresses for each of these hosts as 127.0.0.1 since the services correspond to one of these hosts and the rrd files are thus written to the appropriate host's rrddir.

As I said before, I am using xinetd with NRPE. I have tried changing the definition of check_nrpe with the inclusion of the -t option in commands.cfg and have also changed the value of command_timeout in nrpe.cfg - each to no avail. Is this because -t and command_timeout are only processed if the NRPE daemon is used?

I have also debugged the plugins and checked the logs for clues. As expected, the xinetd entry for a failed execution of the plugin contains nrpe signal=13, and its duration exceeds ten seconds. In successful runs, however, nrpe exits with status=0, and the duration is generally 20 seconds or less. This is what confused me: my timeout settings are well over a minute, but NRPE seems to treat the threshold as about 20-30 seconds. Here are example log entries:


May 24 14:31:35 A1 xinetd[12471]: START: nrpe pid=28120 from=127.0.0.1
May 24 14:31:35 A1 xinetd[12471]: EXIT: nrpe status=0 pid=27640 duration=20(sec)
May 24 14:31:36 A1 xinetd[12471]: EXIT: nrpe status=0 pid=27662 duration=20(sec)
May 24 14:31:41 A1 xinetd[12471]: EXIT: nrpe status=0 pid=27709 duration=14(sec)
May 24 14:31:42 A1 xinetd[12471]: START: nrpe pid=28373 from=127.0.0.1
May 24 14:31:49 A1 xinetd[12471]: EXIT: nrpe status=0 pid=28373 duration=7(sec)
May 24 14:32:36 A1 xinetd[12471]: EXIT: nrpe signal=13 pid=28120 duration=61(sec)
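To pull the failed sessions out of a longer log, something like the sketch below may help. It assumes the xinetd log format shown above; the function name `nrpe_signal_exits` is made up for illustration. On Linux, signal 13 is SIGPIPE, which would be consistent with the nrpe process trying to write its reply after check_nrpe had already given up and closed the socket, though the log alone does not prove that.

```shell
#!/bin/sh
# Hypothetical helper: read xinetd log lines on stdin and list the nrpe
# sessions that ended on a signal rather than a normal exit status.
nrpe_signal_exits() {
    awk '
        / EXIT: nrpe / {
            sig = ""; pid = ""; dur = ""
            # Pick the key=value fields out of the EXIT line.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^signal=/)   sig = substr($i, 8)
                if ($i ~ /^pid=/)      pid = substr($i, 5)
                if ($i ~ /^duration=/) dur = substr($i, 10)
            }
            if (sig != "") {
                sub(/\(sec\)$/, "", dur)   # "61(sec)" becomes "61"
                print "pid=" pid " signal=" sig " duration=" dur "s"
            }
        }
    '
}
```

Usage would be along the lines of `nrpe_signal_exits < /var/log/messages`, where the log path is an assumption that depends on the syslog configuration.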


This has to be an issue with NRPE not accepting my timeout values, right? If I execute a series of service checks manually, the first few may return valid output and perfdata, but some return the socket timeout, leaving gaps in my graphs and discontinuities in my data. The output of the checks is as follows:


[root@A1 libexec]# ./check_nrpe -H 127.0.0.1 -c vperf_mhz_a; ./check_nrpe -H 127.0.0.1 -c vperf_disk_a; ./check_nrpe -H 127.0.0.1 -c vperf_pct_a; ./check_nrpe -H 127.0.0.1 -c vperf_sys_a

OK - 135 MHz; |mhz=102;109;95;173;144;127;126;167;148;167;
OK - 5 disk; |disk=2;2;22;2;2;2;2;5;9;3;
OK - 60%; |pct=46;40;74;61;54;54;71;63;71;75;
CHECK_NRPE: Socket timeout after 10 seconds.
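When chasing an intermittent timeout like the one above, it can help to record how long each check takes and what it returns. The following is a rough sketch: the `time_checks` function and the `CHECK_NRPE` variable are assumptions for illustration, while `-H` and `-c` are used exactly as in the commands above.

```shell
#!/bin/sh
# Path to check_nrpe; an assumption, override it to match your install.
CHECK_NRPE=${CHECK_NRPE:-./check_nrpe}

# Hypothetical helper: run each named NRPE command against HOST and print
# its exit status and wall-clock duration in whole seconds.
# Usage: time_checks HOST COMMAND...
time_checks() {
    host=$1; shift
    for cmd in "$@"; do
        start=$(date +%s)
        rc=0
        "$CHECK_NRPE" -H "$host" -c "$cmd" >/dev/null 2>&1 || rc=$?
        end=$(date +%s)
        echo "$cmd rc=$rc elapsed=$((end - start))s"
    done
}
```

For instance, `time_checks 127.0.0.1 vperf_mhz_a vperf_disk_a vperf_pct_a vperf_sys_a` would show which checks return a non-zero status and whether they fail at about the 10-second mark.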


Lastly, I want to note that I have set the correct file permissions so that nagios.nagios can access the plugins and the directories involved in their execution. Thank you in advance for any help you can offer.

-Jeff
 
Old 05-30-2012, 09:57 AM   #4
jrb328
LQ Newbie
 
Registered: May 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
Solved

I figured out the issue here. The change to the timeout variable in nrpe.cfg did take effect, but the -t 120 flag I added only applies to the check_nrpe command definition that Nagios uses. When I run check_nrpe manually, I have to pass -t 120 on every command line myself; otherwise check_nrpe falls back to its default 10-second socket timeout.

In commands.cfg, check_nrpe was changed to:


# -t 120 raises the check_nrpe socket timeout above its 10-second default
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 120 -c $ARG1$
}


If I run ./check_nrpe -H 127.0.0.1 -c *command_name*, the timeout defaults to 10 seconds.

However, if I run ./check_nrpe -H 127.0.0.1 -t 120 -c *command_name*, the 120-second timeout is used, matching the value set in nrpe.cfg:

command_timeout=120
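To avoid forgetting the flag during manual runs, a small wrapper along these lines could be used. It is only a sketch: `nrpe_check`, `NRPE_TIMEOUT`, and `CHECK_NRPE` are hypothetical names, and the 120-second default simply mirrors the value above.

```shell
#!/bin/sh
# Assumed defaults; adjust to match your commands.cfg and install path.
NRPE_TIMEOUT=${NRPE_TIMEOUT:-120}
CHECK_NRPE=${CHECK_NRPE:-./check_nrpe}

# Hypothetical wrapper: always pass -t so a manual run uses the same
# socket timeout as the check_nrpe command definition in commands.cfg.
# Usage: nrpe_check HOST COMMAND
nrpe_check() {
    host=$1 cmd=$2
    "$CHECK_NRPE" -H "$host" -t "$NRPE_TIMEOUT" -c "$cmd"
}
```

Calling `nrpe_check 127.0.0.1 vperf_sys_a` would then be equivalent to `./check_nrpe -H 127.0.0.1 -t 120 -c vperf_sys_a`.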



-Jeff
 
  


All times are GMT -5.
