LinuxQuestions.org
07-30-2007, 09:21 AM   #1
aeby
Member
 
Registered: Mar 2007
Posts: 109

Rep: Reputation: 15
nagios + nrpe configuration


Hi,

I have installed Nagios and it is running fine. I then installed NRPE, but it is giving me configuration errors, even though running the check commands manually returns the correct values for all of them.
Nagios reports an error on startup in hosts.cfg, where I have defined all the host specifications.

Reading configuration data...

Error: Invalid max_attempts, check_interval, retry_interval, or notification_interval value for service 'CPU Load' on host 'remotehost'
Error: Could not register service (config file '/usr/local/nagios//etc/hosts.cfg', starting on line 196)

***> One or more problems was encountered while processing the config files...

Check your configuration file(s) to ensure that they contain valid
directives and data defintions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.

I cannot work out which values I should provide.

Regards
Aeby

Last edited by aeby; 07-30-2007 at 09:23 AM.
 
08-03-2007, 08:52 AM   #2
MensaWater
Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 5,940
Blog Entries: 5

Rep: Reputation: 753
Look at your services.cfg for entries related to "remotehost" and find the one for "CPU Load". It is telling you that you haven't specified that correctly.

An example for one of my Windows hosts (which runs nsclient rather than NRPE):
Code:
define service{
        use                             generic-service
        host_name                       ATMSPS03
        service_description             # CPULOAD
        contact_groups                  ms-admins, noc-op, sharepoint-admins
        check_command                   check_nt_cpuload!"5,80,90,30,50,70"
}
An example for one of our Linux hosts (which DOES use NRPE):
Code:
define service{
        use                     generic-service
        host_name               ATLP1D01
        service_description     # CPU Load Averages
        contact_groups          ux-admins, db-pc1, noc-op
        check_period            pc1_linux
        check_command           check_nrpe!check_load
}
On the specified Linux host the nrpe config contains the following for the check_load called from the above Nagios services.cfg:
Code:
command[check_load]=/usr/local/nagios/libexec/check_load -w 13.6,10.2,6.8 -c 16.0,12.0,8.0
This checks the 1, 5 and 15 minute load averages. The -w values are the warning thresholds for those three averages, and the -c values are the critical thresholds for the same three averages.
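To make the mapping between thresholds and load averages explicit, here is a minimal Python sketch that splits the NRPE command definition above into labelled warning and critical values. The helper name parse_check_load is hypothetical, and the 1/5/15 minute labels assume check_load's usual argument order (1 minute first, 15 minute last).

```python
import shlex


def parse_check_load(definition):
    """Split an NRPE command definition into its name and threshold triples.

    Hypothetical helper for illustration; it only understands the
    '-w a,b,c -c a,b,c' form used in the example above.
    """
    name_part, command_line = definition.split("=", 1)
    # "command[check_load]" -> "check_load"
    name = name_part[name_part.index("[") + 1:name_part.index("]")]
    args = shlex.split(command_line)
    windows = ("1min", "5min", "15min")  # assumed order of the three values
    warn = [float(v) for v in args[args.index("-w") + 1].split(",")]
    crit = [float(v) for v in args[args.index("-c") + 1].split(",")]
    return name, dict(zip(windows, warn)), dict(zip(windows, crit))


name, warn, crit = parse_check_load(
    "command[check_load]=/usr/local/nagios/libexec/check_load "
    "-w 13.6,10.2,6.8 -c 16.0,12.0,8.0"
)
print(name)
print("warning: ", warn)
print("critical:", crit)
```

Each position in the -w list pairs with the same position in the -c list, so the 1 minute average here warns at 13.6 and goes critical at 16.0.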
 
08-06-2007, 08:23 AM   #3
aeby
Member
 
Registered: Mar 2007
Posts: 109

Original Poster
Rep: Reputation: 15
Thanks, that helped.
 
03-18-2008, 03:45 AM   #4
bhagya
LQ Newbie
 
Registered: Mar 2008
Posts: 7

Rep: Reputation: 0
NRPE configuration

This is urgent. After installing NRPE I get the following error:



Reading configuration data...

Error: Invalid max_attempts, check_interval, retry_interval, or notification_interval value for service 'CPU Load' on host 'remotehost'
Error: Could not register service (config file '/usr/local/nagios/etc/services.cfg', starting on line 36)

***> One or more problems was encountered while processing the config files...

Check your configuration file(s) to ensure that they contain valid
directives and data defintions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.





# Service definition
define service{ #Name of service template to use
use generic-service
host_name remotehost
service_description PING
is_volatile 0
check_period alltime
max_check_attempts 5
normal_check_interval 5
retry_check_interval 1
contact_groups nagios
notification_interval 960
notification_period alltime
notification_options w,u,c,r
check_command check_ping!100.0,20%!500.0,60%


active_checks_enabled 1 ;
passive_checks_enabled 1 ;
parallelize_check 1 ;
obsess_over_service 1 ;

obsess_over_service 1 ;
check_freshness 0 ;
notifications_enabled 1 ;
event_handler_enabled 1 ;
flap_detection_enabled 1 ;
failure_prediction_enabled 1 ;
process_perf_data 1 ;
retain_status_information 1 ;
retain_nonstatus_information 1 ;
register 0 ;
}

define service{
use generic-service
host_name remotehost
service_description CPU Load
check_command check_nrpe!check_load
}
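
The error message names four values. A rough way to see which of them a service block sets for itself is sketched below in Python. This is only an illustration: the helper name missing_directives is hypothetical, the directive names are the ones used in the PING definition above, and the check ignores anything inherited through "use", so a directive reported as missing may still be supplied by a correctly defined template.

```python
# Directive names taken from the PING definition in this post.
REQUIRED = (
    "max_check_attempts",
    "normal_check_interval",
    "retry_check_interval",
    "notification_interval",
)


def missing_directives(block):
    """List the REQUIRED directives a 'define service{...}' block does not set itself."""
    present = set()
    for line in block.splitlines():
        line = line.split(";", 1)[0].strip()  # drop trailing ';' comments
        if not line or line.startswith(("#", "define", "}")):
            continue
        present.add(line.split()[0])
    return [d for d in REQUIRED if d not in present]


cpu_load = """
define service{
use generic-service
host_name remotehost
service_description CPU Load
check_command check_nrpe!check_load
}
"""
print(missing_directives(cpu_load))
```

For the CPU Load block this reports all four directives, which means they can only come from the generic-service template.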
 
05-28-2012, 10:28 AM   #5
sburnay
LQ Newbie
 
Registered: Sep 2011
Location: Lisbon, Portugal
Distribution: Ubuntu, CentOS & SUSE
Posts: 26

Rep: Reputation: Disabled
Quote:
On the specified Linux host the nrpe config contains the following for the check_load called from the above Nagios services.cfg:
Code:
command[check_load]=/usr/local/nagios/libexec/check_load -w 13.6,10.2,6.8 -c 16.0,12.0,8.0
This checks the 1, 5 and 15 minute load averages. The -w values are the warning thresholds for those three averages, and the -c values are the critical thresholds for the same three averages.
Well, I understand that check_load applies to the 1, 5 and 15 minute load averages.

I just don't get the values that are usually set for the thresholds (-w 13.6,10.2,6.8 -c 16.0,12.0,8.0).

I never figured them out, but now I need to understand them and I can't.
 
05-28-2012, 06:50 PM   #6
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,225

Rep: Reputation: 2021
Seems clear to me; does this help http://nagiosplugins.org/man/check_load ?
If not, can you be more specific?
 
05-29-2012, 11:52 AM   #7
MensaWater
Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 5,940
Blog Entries: 5

Rep: Reputation: 753
You really shouldn't post in 4-year-old threads. Only the people who initially subscribed are likely to see the post, and then only if they are still members, so you might not get a response. Instead, open a new thread and link back to the old threads if they are germane.

To answer your question:
Quote:
Well, I understand that check_load applies to the 1, 5 and 15 minute load averages.

I just don't get the values that are usually set for the thresholds (-w 13.6,10.2,6.8 -c 16.0,12.0,8.0).
-w = warning thresholds
-c = critical thresholds

The thresholds are entirely arbitrary. What it checks is:

If the 1 minute load average is greater than or equal to 13.6 and less than 16.0, Nagios shows WARNING. If it reaches 16.0 or greater, Nagios shows CRITICAL. For anything below 13.6, Nagios shows OK.

If the 5 minute load average is greater than or equal to 10.2 and less than 12.0, Nagios shows WARNING. If it reaches 12.0 or greater, Nagios shows CRITICAL. For anything below 10.2, Nagios shows OK.

If the 15 minute load average is greater than or equal to 6.8 and less than 8.0, Nagios shows WARNING. If it reaches 8.0 or greater, Nagios shows CRITICAL. For anything below 6.8, Nagios shows OK.

As can be seen, it is quite possible (and even usual) that the 1 and 5 minute loads will be OK but the 15 minute load is WARNING or CRITICAL (or that the 1 and 15 are OK and the 5 is WARNING or CRITICAL, or that the 5 and 15 are OK and the 1 is WARNING or CRITICAL). In such a case Nagios displays the highest level of alert. That is, if one of the 3 load averages is WARNING and the other 2 are OK (or WARNING), Nagios shows WARNING. On the other hand, if 1 is WARNING, 1 is OK and 1 is CRITICAL, Nagios shows CRITICAL.
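
The "highest level of alert wins" rule can be sketched in a few lines of Python. This is an illustration of the logic described in this post, not the plugin's source: the function name load_state is hypothetical, and the inclusive comparisons (greater than or equal) follow the description above, so the exact boundary behaviour of a given check_load version may differ.

```python
OK, WARNING, CRITICAL = 0, 1, 2
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def load_state(loads, warn, crit):
    """Return the worst state across the three load averages."""
    worst = OK
    for load, w, c in zip(loads, warn, crit):
        if load >= c:
            state = CRITICAL
        elif load >= w:
            state = WARNING
        else:
            state = OK
        worst = max(worst, state)  # the highest alert level wins
    return NAMES[worst]


# Thresholds from the example: -w 13.6,10.2,6.8 -c 16.0,12.0,8.0
WARN = (13.6, 10.2, 6.8)
CRIT = (16.0, 12.0, 8.0)

print(load_state((5.0, 4.0, 3.0), WARN, CRIT))   # every average below its warning threshold
print(load_state((5.0, 4.0, 7.0), WARN, CRIT))   # only the last average is in its warning band
print(load_state((14.0, 4.0, 9.0), WARN, CRIT))  # one WARNING, one OK, one CRITICAL
```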

As to why the thresholds are set that way: as noted, it is arbitrary. On the system the example was taken from, the load regularly went quite high without impacting operations, so the thresholds were set to alert only if the load went ABOVE the loads it normally runs at. The reasoning for the highest threshold on the 1 minute average, a lower one on the 5 and a lower one still on the 15 is that it isn't unusual for a system to temporarily run a high load, so we don't want it alerting on momentary spikes but rather on real jumps. At the 15 minute level the threshold is lower because a high value there isn't just a spike: the system has been running at a high load for several minutes.

For systems on which you have no experience you might set your checks to WARN at 1.0 and go CRITICAL at 2.0. To cover the 1, 5 and 15 minute averages you could apply relative weighting as I do above. Over time you can tweak to other values as you determine what is "normal" for the specific system you're monitoring.

We have some systems here that regularly hit loads over 30 without seeming to cause anyone issues. Linux load figures do not seem to be congruent with UNIX load figures. Also be sure to take into account whether the load you're checking is spread across multiple processors/cores. A load of 1.0 on a single-core processor means the run queue is full and processing things as they arrive. A load of 2.0 on two single-core processors (or one dual-core) would be the equivalent. Later versions of check_load have a per-CPU averaging option (-r, --percpu) so that you can try to standardize what you alert on.
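
The per-core arithmetic is simple enough to sketch in Python. The function name per_cpu_load is hypothetical and the code is only an illustration of dividing each load average by the number of cores, not the plugin's implementation. os.getloadavg() and os.cpu_count() are standard library calls; the first is only available on Unix-like systems, so it is guarded here.

```python
import os


def per_cpu_load(loads, cpu_count):
    """Divide each load average by the number of CPUs/cores."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in loads)


# A load of 2.0 on a dual-core box is equivalent to 1.0 on a single core.
print(per_cpu_load((2.0, 1.0, 0.5), 2))

# On a Unix-like host, normalize the live figures the same way.
if hasattr(os, "getloadavg"):
    print(per_cpu_load(os.getloadavg(), os.cpu_count() or 1))
```

With normalized figures, the same WARNING and CRITICAL thresholds can be reused across machines with different core counts.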
 
  

