LinuxQuestions.org
Old 03-27-2017, 12:48 PM   #1
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Rep: Reputation: 39
check_folder_size problem on Nagios server


Hello

We are running Nagios 4.1.1 on an Ubuntu Linux 14.04 LTS 64-bit server, and the client system in question is running Red Hat Enterprise Linux 7.2 64-bit.
The folders that are the targets of the monitoring are located on an xfs filesystem on top of a logical volume. I have installed the nrpe 2.12 client as well as the
check_folder_size.sh script on the client.

An excerpt from the nrpe.cfg file is shown below:

Quote:
command[check_folder_size25]=/usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips1
command[check_folder_size26]=/usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips2
command[check_folder_size27]=/usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips3
command[check_folder_size28]=/usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips4
command[check_folder_size29]=/usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips5
When I run any of the above commands locally on the client as root, the script works without issue. One example is the following:

Quote:
# /usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips1
179200 MB used!
The services.cfg file on the Nagios server has several entries like the following:

Quote:
define service {
service_description Free Space /xfs4/dips1
check_command check_nrpe!check_folder_size25
host_name <client hostname>
check_period 24x7
notification_period 24x7
contact_groups linux-admins
event_handler_enabled 0
active_checks_enabled 1
passive_checks_enabled 0
notifications_enabled 1
check_freshness 0
freshness_threshold 86400
use generic-service
}
The web interface on the Nagios server has the following output on-screen:

Quote:
Free Space /xfs4/dips1 UNKNOWN 03-27-2017 13:04:14 0d 0h 34m 15s 3/3 The filesystem doesnt exist or there is a script error
Free Space /xfs4/dips2 UNKNOWN 03-27-2017 13:04:14 0d 0h 37m 38s 3/3 The filesystem doesnt exist or there is a script error
Free Space /xfs4/dips3 UNKNOWN 03-27-2017 13:04:14 0d 0h 35m 38s 3/3 The filesystem doesnt exist or there is a script error
Free Space /xfs4/dips4 UNKNOWN 03-27-2017 13:04:14 0d 0h 33m 38s 3/3 The filesystem doesnt exist or there is a script error
Free Space /xfs4/dips5 UNKNOWN 03-27-2017 13:04:14 0d 0h 31m 38s 3/3 The filesystem doesnt exist or there is a script error
We have another server running Red Hat Enterprise Linux 6.6 that also has the check_folder_size.sh script on it, and it is not having this problem. I made sure to copy over the script from that system to the one having the problem.
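For context on the UNKNOWN status: Nagios plugins report their state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below is not the real check_folder_size.sh; `classify_size` is a hypothetical helper, and it assumes the two thresholds from nrpe.cfg are warning and critical values in MB.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the actual check_folder_size.sh.
# Nagios plugin exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
classify_size() {
    used_mb=$1
    warn_mb=$2   # assumed: first threshold is the warning level, in MB
    crit_mb=$3   # assumed: second threshold is the critical level, in MB

    case $used_mb in
        ''|*[!0-9]*)
            # The size lookup failed or printed something that is not a number.
            echo "UNKNOWN - could not determine folder size"
            return 3
            ;;
    esac

    if [ "$used_mb" -ge "$crit_mb" ]; then
        echo "CRITICAL - $used_mb MB used"
        return 2
    elif [ "$used_mb" -ge "$warn_mb" ]; then
        echo "WARNING - $used_mb MB used"
        return 1
    fi
    echo "OK - $used_mb MB used"
    return 0
}

# Example: classify_size 179200 150000 200000
```

Under these assumptions, an empty or non-numeric size (for instance because the command that measures the folder failed) is what ends up as UNKNOWN on the server side.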

Does anyone have an idea as to why this is occurring, and how it can be corrected?

Thanks.
 
Old 03-27-2017, 03:39 PM   #2
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1668
What user is running nrpe on the client?

I ran into a similar issue long ago and was maddened that the check worked when I ran the command as root but not when I ran it from my Nagios server. It turned out that it failed from the Nagios server because I'd configured the nrpe client (via xinetd) to run as user "nagios". Running the command on the client as that user gave me the same error I'd seen from the Nagios master server.

Ultimately my issue turned out to be that the user "nagios" on the client didn't have permissions to read the filesystem at all but it was difficult to get to that point until I realized that was the user that was doing the check (i.e. NOT the root user.)

I posted a blog at the time:
http://www.linuxquestions.org/questi...ck_nrpe-36015/
 
Old 03-28-2017, 01:28 PM   #3
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Original Poster
Rep: Reputation: 39
Hello --

Thank you for your email. I tried recompiling nrpe and nagios-plugins with the user and group set to root. However, the nrpe client refused to start with those settings. I then recompiled with the user and group set to nagios, changed the shell for nagios to bash, and gave nagios sudo access for the du command. When I ran the following command interactively:

Quote:
# su - nagios /usr/local/nrpe/libexec/check_folder_size.sh 150000 200000 /xfs4/dips1
The command completed successfully. However, even with these changes, the same error messages continued to occur.

I'm not sure what else to try at this point.
 
Old 03-28-2017, 02:32 PM   #4
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1668
What do "ls -ld /xfs4" and "ls -ld /xfs4/dips1" show?

My suggestion wouldn't really be to run nrpe as root but rather to set permissions on /xfs4/dips1 so that user "nagios" can access it. You don't have to give write access.

If you're using ACLs or SELinux you might have to do other steps to allow permission.
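Every directory on the way to the target needs search (execute) permission for the nagios user, which is why both levels are worth listing. A minimal sketch that walks up from a path and prints `ls -ld` for each component; `show_path_perms` is a hypothetical helper name, not part of Nagios or nrpe.

```shell
#!/bin/sh
# Hypothetical helper: list permissions for a path and each parent directory.
show_path_perms() {
    path=$1
    while :; do
        ls -ld "$path"
        # Stop at the filesystem root (or "." for a relative path).
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
}

# Example: show_path_perms /xfs4/dips1
```

With GNU ls, adding `-Z` to the `ls -ld` call also prints the SELinux context, which may help on a system set to enforcing.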
 
Old 03-28-2017, 03:07 PM   #5
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Original Poster
Rep: Reputation: 39
Hello --

The ls -ld command on the directory had the following output:

Quote:
# ls -ld /xfs4/dips1/
drwxr-xr-x. 10 root root 4096 Mar 27 11:09 /xfs4/dips1/
I changed the permissions on the directory to the following:

Quote:
drwxrwxr-x. 4 root root 49 Mar 6 2010 dips1
There was no improvement when I restarted the nrpe client.

I checked the selinux setting, and it is set to enforcing. However, another system running RHEL 6.6 has a similar setting, and the script is working without issue there.
 
Old 03-29-2017, 01:03 PM   #6
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,765

Rep: Reputation: 797
Any nrpe errors in /var/log/messages ?
Run the following trace to see what it runs
Code:
strace -f -s 256 -e execve -p `pgrep -ox nrpe`

Last edited by MadeInGermany; 03-29-2017 at 01:09 PM.
 
Old 03-29-2017, 02:37 PM   #7
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Original Poster
Rep: Reputation: 39
Hello --

There were several entries in the messages file.

Quote:
nrpe[136775]: Unable to open config file '/etc/xinetd.d/nrpe.cfg' for reading
nrpe[136775]: Config file '/etc/xinetd.d/nrpe.cfg' contained errors, aborting...
The first had to do with my creating an nrpe file in the xinetd.d directory. The syntax that I used was the following:

Quote:
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
flags = REUSE
type = UNLISTED
port = 5666
socket_type = stream
wait = no
user = nagios
group = nagios
server = /usr/local/nrpe/bin/nrpe
server_args = -c /usr/local/nrpe/etc/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
}
When I changed the nrpe.cfg file entries for the nagios user and nagios group to root, the following entries appeared in the messages file:

Quote:
nrpe[136878]: Error: NRPE daemon cannot be run as user/group root!
I ran the strace command syntax listed in your posting, and what follows is an excerpt of the output:

Quote:
strace -f -s 256 -e execve -p `pgrep -ox nrpe`
Process 50752 attached
Process 102426 attached
Process 102427 attached
[pid 102426] +++ exited with 0 +++
[pid 50752] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102426, si_status=0, si_utime=0, si_stime=0} ---
Process 102428 attached
Process 102429 attached
[pid 102429] execve("/bin/sh", ["sh", "-c", "/usr/local/nrpe/libexec/check_load -w 15,10,5 -c 30,25,20"], [/* 27 vars */]) = 0
[pid 102429] execve("/usr/local/nrpe/libexec/check_load", ["/usr/local/nrpe/libexec/check_load", "-w", "15,10,5", "-c", "30,25,20"], [/* 26 vars */]) = 0
Process 102430 attached
[pid 102430] execve("/usr/bin/uptime", ["/usr/bin/uptime"], [/* 1 var */]) = 0
[pid 102430] +++ exited with 0 +++
[pid 102429] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102430, si_status=0, si_utime=0, si_stime=0} ---
[pid 102429] +++ exited with 0 +++
[pid 102428] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102429, si_status=0, si_utime=0, si_stime=0} ---
[pid 102428] +++ exited with 0 +++
[pid 102427] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102428, si_status=0, si_utime=0, si_stime=0} ---
[pid 102427] +++ exited with 0 +++
Process 102431 attached
Process 102432 attached
[pid 102431] +++ exited with 0 +++
[pid 50752] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102431, si_status=0, si_utime=0, si_stime=0} ---
Process 102433 attached
Process 102434 attached
[pid 102433] +++ exited with 0 +++
[pid 50752] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=102433, si_status=0, si_utime=0, si_stime=0} ---
Process 102435 attached
Process 102436 attached
Process 102437 attached
Process 102438 attached
[pid 102435] +++ exited with 0 +++
 
Old 03-30-2017, 08:09 AM   #8
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Original Poster
Rep: Reputation: 39
Hello --

I checked the output of the strace command further, and the following entry was present:

Quote:
Process 131343 attached
[pid 131343] execve("/usr/bin/sudo", ["sudo", "/usr/bin/du", "-shL", "/xfs4/dips1"], [/* 26 vars */]) = 0
[pid 131343] +++ exited with 1 +++
I ran the above command interactively in the following manner:

Quote:
su - nagios /usr/bin/sudo /usr/bin/du -shL /xfs4/dips1
The output that I got was the following:

Quote:
su: failed to execute hL: No such file or directory
I checked the du man page on the system, and the -shL options are listed as available. Also, when I ran the command as root, it completed successfully.
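The `su` error above looks like `su` reading `-shL` as its own `-s` (shell) option with the argument `hL`, rather than a problem with `du`. A minimal sketch of one way around that, passing the whole command line as a single quoted string through `-c`; `run_as` is a hypothetical wrapper, and the shortcut for "already that user" is only there so the function can be tried without root.

```shell
#!/bin/sh
# Hypothetical wrapper: run one command line as another user, keeping the
# command in a single string so that su does not parse its options.
run_as() {
    target_user=$1
    command_line=$2
    if [ "$(id -un)" = "$target_user" ]; then
        # Already that user: no su needed.
        sh -c "$command_line"
    else
        # -s picks the shell to use; -c hands the command line to that shell.
        su -s /bin/sh "$target_user" -c "$command_line"
    fi
}

# Example (as root on the client):
# run_as nagios '/usr/bin/sudo /usr/bin/du -shL /xfs4/dips1'
```

Run this way, any failure that remains should come from sudo or du themselves rather than from how su split the arguments.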

Last edited by kaplan71; 03-30-2017 at 08:13 AM.
 
Old 03-30-2017, 09:55 AM   #9
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,765

Rep: Reputation: 797
The nagios user appears to run
Code:
sudo /usr/bin/du -shL /xfs4/dips1
Perhaps this is coded in the /usr/local/nrpe/libexec/check_folder_size.sh
Does the command work on the command line?
Ensure you do not have
Code:
Defaults    requiretty
in /etc/sudoers! Delete it, #comment it, or change value to !requiretty
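A quick way to look for an active setting, sketched under the assumption that it sits on its own `Defaults` line as shown above; `has_requiretty` is a hypothetical helper, and it would miss a line that combines several defaults.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given sudoers-style file contains an
# active (uncommented, non-negated) "Defaults requiretty" line.
has_requiretty() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty([[:space:]]|$)' "$1"
}

# Example (as root): has_requiretty /etc/sudoers && echo "requiretty is set"
#
# A narrower alternative to removing the line is a per-user override in
# sudoers, edited with visudo:
#   Defaults:nagios !requiretty
```

The per-user form keeps the restriction for everyone else while letting the nagios account call sudo without a terminal.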
 
Old 03-30-2017, 10:11 AM   #10
kaplan71
Member
 
Registered: Nov 2003
Posts: 809

Original Poster
Rep: Reputation: 39
Hello --

Your suggestion to change the setting in the sudoers file solved the problem. Thank you for the help.

If I may ask: What does commenting out that setting do exactly?
 
Old 03-30-2017, 10:53 AM   #11
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,765

Rep: Reputation: 797
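It restricts sudo to interactive use: sudo checks for a terminal, much as the tty command does, and refuses to run without one. Commenting it out lets sudo run from daemons such as nrpe, which have no terminal.
This is enabled by default on CentOS/Red Hat, but not on SUSE.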
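A rough analogue of that check in shell, using `tty -s`, which succeeds only when standard input is a terminal. sudo's own test differs in detail, so this is only an illustration; `has_terminal` and `describe_session` are hypothetical names.

```shell
#!/bin/sh
# Rough analogue only: `tty -s` exits 0 when stdin is a terminal, non-zero
# otherwise. sudo's real requiretty check is implemented differently.
has_terminal() {
    tty -s
}

describe_session() {
    if has_terminal; then
        echo "terminal attached: requiretty would likely be satisfied"
    else
        echo "no terminal: requiretty would likely block sudo"
    fi
}

# Example: compare an interactive shell with a redirected stdin.
# describe_session
# describe_session </dev/null
```

A check started by nrpe resembles the second case, which would explain why the same command worked from a root login but not from the Nagios server.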
 
  

