LinuxQuestions.org


babaqga 09-02-2010 05:10 AM

Nagios NRPE problem
 
Hi guys, I have the following script:

Code:

#!/bin/bash
NRCORES=`mpstat  | grep CPU | head -n1 | awk '{print $6}' | tr -d "(" | bc`
PIDOFJAVA=`pidof java`
`top -bn2 -p $PIDOFJAVA > "/usr/local/nagios/libexec/TopOutput.txt"`
OUTPUT=`cat "/usr/local/nagios/libexec/TopOutput.txt" | awk '{print $3}' | grep "," | tr -d "total," | tr -d "\r" | (read;cat)`
NRCORES=`echo $NRCORES| bc`
ENDLOOP=`echo $(($NRCORES*2)) | bc`
NRCORES=$(($NRCORES+1))
CORES=`echo $OUTPUT | awk '{
for (i='$NRCORES';i<='$ENDLOOP';i++)
print $i
}'`
CORES=`echo $CORES | tr "\s" "\;" | tr -d "\u" | tr -d " " | tr -d "%"`
CORES="SERVICE OK | spread=$CORES"
echo $CORES
#echo "SERVICE OK | spread=0.0;0.2;0.3;0.4;0.0;0.0;0.6;0.0;"
exit 0

If I run the script as THE NAGIOS USER, but without using NRPE, I get the following output:
Code:

[nagios@ws230 libexec]$./cpu_utilization.sh
spread=0.0;0.0;0.0;0.0;0.3;0.0;0.3;0.0;
[nagios@ws230 libexec]$


However, if I run the same script through NRPE, like this, I get a totally different result.
Code:

[nagios@ws230 libexec]$./check_nrpe -H localhost -c command
spread=
[nagios@ws230 libexec]$

This FRUSTRATES ME, because if I echo $OUTPUT inside the script,
I get this when using NRPE:
Code:

1.8%sy 0.0%sy
Without using NRPE:
Code:

0.6%us 1.2%us 1.2%us 1.2%us 0.8%us 0.6%us 0.7%us 0.6%us 0.0%us 0.0%us 0.0%us 0.3%us 0.0%us 0.0%us 0.0%us 0.0%us
I just don't know what to do, since both the script and TopOutput.txt are owned by the nagios user/group and have read and write privileges.

So, top executes differently when called by the same user, once from NRPE and once from the script run locally. This is strange, since the NRPE daemon runs as the same user I am logged in as locally in bash and executes the same script.
You will notice that I am even using absolute paths, so nothing can go wrong. In addition, when I run the script through NRPE, I can clearly see that its output is totally different from the output it produces when called directly. Any hints?


Help!


EDIT: I found what the problem is. TOP has different .toprc with different users. However, I am not sure how to edit the TOPRC for the nagios user so it is the same as the .toprc for the root user. They look the same to me. Is there a way to tell the top command which configuration file to use?

EDIT2: It seems to me this is a bug in the procps package. That is why I updated procps to the latest version for my distro, CentOS 5.5, using the following package:
http://rpm.pbone.net/index.php3/stat...86_64.rpm.html

HOWEVER, it doesn't work even now and top, called from NRPE (with user nagios) uses its SHITTY defaults, while top, called normally (again, with user nagios) somehow FINDS /home/nagios/.toprc and uses it. I can't seem to grasp what its problem is. Is there something besides which user you are that can affect top's behavior?



I can't believe that nobody in this forum has seen this problem.

MensaWater 09-03-2010 02:37 PM

My first thought any time something works from the command line but not from automation (e.g. init scripts, cron jobs or Nagios checks) is that it is because of environment inheritance.

When you log in (or "su -" to the user nagios), you get a full environment from things like /etc/profile, /etc/bashrc and $HOME/.bashrc (or $HOME/.profile) if running bash or ksh. Automation, however, typically has few if any of the environment variables set by these and instead uses a canned set for that tool. I've not seen this with Nagios, but then I've not tried to use a .toprc (or top itself) with Nagios or NRPE.

The solution for automation is to ensure the requisite variables are set in the script itself OR that you source the standard environment files mentioned above. (Usually it is cleaner to do the former.)

It might help to add:
Code:

HOME=<home directory of Nagios user>
export HOME

to the script file so it knows where to find $HOME/.toprc.

Also, sometimes the issue is OTHER variables that mask the problem you are having. That is to say, something else is failing earlier in the script (e.g. one of the commands you're using is not found because PATH is not set). This, by the way, is the most common problem I see with scripts called from automation.
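
One way to see which variables differ is to record the environment in both contexts and compare them. What follows is only a minimal sketch and is not from the posts in this thread: the function name compare_env and the output file names are made up for illustration, and "env -i" only roughly imitates the reduced environment a daemon-launched check might receive.

```shell
#!/bin/bash
# Hypothetical helper: save the environment of the current shell and of a
# shell started with an empty environment (env -i), then show the difference.
compare_env() {
    local dir="$1"

    # Environment as the current (e.g. login) shell sees it.
    env | sort > "$dir/env_current.txt"

    # Environment of a bash started with everything cleared.
    env -i "$(command -v bash)" -c 'env | sort' > "$dir/env_stripped.txt"

    # diff exits non-zero when the files differ, which is expected here.
    diff "$dir/env_current.txt" "$dir/env_stripped.txt" || true
}
```

Calling compare_env with a writable directory prints the variables (HOME, PATH and so on) that exist in the login shell but not in the stripped one. For the real NRPE case, a temporary "env | sort > some_file" line inside the check script itself shows what NRPE actually passes.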

babaqga 09-04-2010 05:06 PM

I believe you are right. First thing I'm gonna do when I get to work.

babaqga 09-06-2010 02:41 AM

Yep you were right. Closing the topic. Wasn't a bug with the procps.

Cheers.

MensaWater 09-07-2010 10:21 AM

Glad to hear you got it fixed.

So did just adding the HOME variable do it, or was there something more you had to add?

babaqga 09-08-2010 04:08 AM

Yep, only adding HOME and exporting it did the trick.
Keep in mind, though, that I have been calling top by its absolute location, e.g. /usr/bin/top, so if you omit that, in some cases you might also want to set the PATH variable to what it is for your user on your particular distro.
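
Put together, the top of a check script could look roughly like the sketch below. It is an illustration only: /home/nagios is taken from the .toprc location mentioned earlier in the thread, and the PATH value is an assumption that should be replaced with whatever your user has on your distro.

```shell
#!/bin/bash
# Sketch of a preamble for a check script run by NRPE.
# HOME lets top find $HOME/.toprc; /home/nagios is the home directory
# mentioned earlier in the thread.
HOME=/home/nagios
# Assumed PATH value; adjust to match the user's PATH on your distro.
PATH=/usr/local/bin:/usr/bin:/bin
export HOME PATH

# Warn early if a needed command cannot be found with this PATH.
command -v top >/dev/null || echo "top not found in PATH=$PATH" >&2
```

With these lines in place the rest of the script sees the same HOME and PATH whether it is started from a login shell or by the NRPE daemon.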


All times are GMT -5.