LinuxQuestions.org


rutledgetome 01-21-2010 12:13 PM

nagios plugin - check_procs
 
I need to check whether a backup process has hung. Backups on the monitored system normally finish in under 2 hours, so I would like to detect any backup process that has been running for longer than 2 hours.

The Nagios plugin check_procs lists an elapsed-time metric, but it does not seem to work in my environment. I compiled version 1.4.12 of the plugins on AIX 5.3.

Anyone have any suggestions?

EricTRA 01-22-2010 05:04 AM

Hello and welcome to LinuxQuestions,

What's the exact command you are running, i.e. with what parameters are you executing the check_procs plugin?

Kind regards,

Eric

rutledgetome 01-22-2010 05:39 AM

Here is the command:

/usr/local/nagios/libexec/check_procs -m ELAPSED 120 -C tsm_db2_backup

This does find the process (when it exists), but no matter how long the process has been running, the command returns:

ELAPSED OK: 1 process with the command name "tsm_db2_backup"

Note: I have also tried checking commands such as init and sshd, still with no success.

While following up on this, I found that check_procs version 1.4.12 on AIX does not offer the metric=ELAPSED option, although the Linux version does. AIX is where I need it.

In the end, what I need is a way to find a runaway process. If you know of other checks for this, please share.

Thanks for taking a look at this problem. All suggestions are welcome.

EricTRA 01-22-2010 06:09 AM

Hello,

I noticed the 'tsm' in the command name you used so I imagine that you're checking up on Tivoli Storage Manager backups. Have a look at this script:
Check TSM on AIX. Maybe it'll give you better results.
From the same site there's also this one.

Kind regards,

Eric

rutledgetome 01-22-2010 08:08 AM

I have found a shell script that does what I need:

http://exchange.nagios.org/directory...untime/details

check_proc_time.sh works under bash or ksh (change the first line to point at the proper shell), so I get the results I need on AIX.

check_proc_time.sh -p tsm_db2_backup -w 3600 -c 7200

gives a WARNING if tsm_db2_backup has been running for 1 hour, CRITICAL if it has been running for 2 hours, and OK if the process is not running. Just what I needed.
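For anyone who wants to see the idea behind this kind of check, here is a rough sketch of the threshold logic. It is not the code from check_proc_time.sh. The function names etime_to_seconds and check_elapsed are made up for illustration, and treating a run time equal to the threshold as a hit is my assumption. It relies on the ps "etime" field ([[dd-]hh:]mm:ss) and on the standard Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Hypothetical helpers, not taken from check_proc_time.sh.

# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    printf '%s\n' "$1" | awk -F'[-:]' '
        { gsub(/[ \t]/, "") }   # drop padding that ps may add; fields are re-split
        NF == 4 { print $1 * 86400 + $2 * 3600 + $3 * 60 + $4; next }  # dd-hh:mm:ss
        NF == 3 { print $1 * 3600 + $2 * 60 + $3; next }               # hh:mm:ss
        NF == 2 { print $1 * 60 + $2; next }                           # mm:ss
        { exit 1 }              # anything else is not a valid etime
    '
}

# Compare elapsed seconds against warning and critical thresholds.
# $1 = elapsed seconds, $2 = warning threshold, $3 = critical threshold
# Assumption: an elapsed time equal to a threshold already triggers it.
check_elapsed() {
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL: process has been running for $1 seconds"
        return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING: process has been running for $1 seconds"
        return 1
    else
        echo "OK: process has been running for $1 seconds"
        return 0
    fi
}
```

With a known PID in $pid, something like check_elapsed "$(etime_to_seconds "$(ps -o etime= -p "$pid")")" 3600 7200 should give the same 1 hour / 2 hour behaviour, assuming the local ps accepts -o etime=. A real plugin would also have to handle the case where the process is not running and return OK, as the script above does.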

Thank you all for taking a look at my query.

EricTRA 01-22-2010 09:51 AM

Hi,

That's great news. Glad to hear you found a solution, and thanks for posting the link here. If you consider your problem/question solved, please mark this thread as solved using the Thread tools.

Kind regards,

Eric


All times are GMT -5.