LinuxQuestions.org

nazs 10-27-2009 06:05 PM

Tools or script to monitor load average on server
 
Hi All,
We have a server running RHEL4 whose load average occasionally spikes above 10, and we have no idea what is causing it. Are there any free tools, or a script, that will start logging the running processes once the load average hits a certain point, so we can see what is happening? Usually, by the time we get logged into the system, the load average is already on its way down. If someone has a better idea, please let me know.

Thank you,
Nazs

vikas027 10-27-2009 08:05 PM

Whenever the load average is high, try to troubleshoot the cause with:

top
vmstat
iostat

....

and see if you can find anything useful. Have you tried these commands, by the way?

nazs 10-27-2009 08:37 PM

Thanks for your response. I have tried top, but the load average was already coming down and I did not see a process that would be causing any trouble. I will give the other commands a try. I would still like to know if there is a way to log when it gets above a certain number.

vikas027 10-27-2009 09:02 PM

Hey,

Run this script in the background and see if you get something useful.

Code:

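#!/bin/bash
# Poll the 1-minute load average and log system state while it is high.
exec &>/dev/null
while true
do
    # First field of /proc/loadavg is the 1-minute load; keep the integer part.
    # (The uptime field position shifts with how long the system has been up.)
    load=$(cut -d. -f1 /proc/loadavg)
    if [ "$load" -gt 5 ]    # You can change this 5 to some other digit depending upon the CPUs you have
    then
        log=/tmp/logs_$(date +%d-%B-%y_%H:%M)
        top -bn1 >> "$log"
        vmstat 1 5 >> "$log"
        iostat 1 5 >> "$log"
        mpstat 1 5 >> "$log"
    fi
    sleep 30    # check again in 30 seconds (arbitrary interval, adjust to taste)
done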

Put this in a file, say vikas.sh,
and run it in the background.
Code:

bash vikas.sh &
This will generate a file named /tmp/logs_<date and time> when the load is above the threshold.

Hope this helps.
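
On the "depending upon the CPUs you have" comment in the script: one possible way to pick the number is to scale it with the CPU count. This is only a rough sketch. The function names cpu_count and load_threshold are made up for illustration, and the default factor of 2 per CPU is an arbitrary assumption, not a recommendation from this thread.

```shell
#!/bin/bash
# Count logical CPUs by counting "processor" lines in /proc/cpuinfo (Linux-specific).
# Falls back to 1 if the file is missing or has no such lines.
cpu_count() {
    n=$(grep -c '^processor' "${1:-/proc/cpuinfo}" 2>/dev/null)
    [ "${n:-0}" -ge 1 ] 2>/dev/null || n=1
    echo "$n"
}

# Suggested threshold = CPUs * factor. The factor defaults to 2 (an assumption).
load_threshold() {
    echo $(( $(cpu_count) * ${1:-2} ))
}

echo "CPUs: $(cpu_count), suggested threshold: $(load_threshold)"
```

The printed threshold could then replace the hard-coded 5 in the script above.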

chrism01 10-27-2009 09:02 PM

You could try running top in batch mode from a script, e.g. every 2 minutes, and if it finds a high load, start the other monitoring commands.
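
A minimal sketch of that idea, meant to be started from cron every 2 minutes. The threshold of 5, the /tmp log location, the loadwatch file name, and the function names are all assumptions for illustration. It also assumes the first line of top's batch output contains "load average:", as procps top prints in an English locale.

```shell
#!/bin/bash
# Hypothetical crontab entry (the script path is an assumption):
#   */2 * * * * /usr/local/bin/loadwatch.sh

THRESHOLD=${THRESHOLD:-5}    # assumed threshold, tune for your CPU count
LOGDIR=${LOGDIR:-/tmp}       # assumed log location

# Read top output on stdin and print the integer part of the 1-minute load.
load_from_top() {
    sed -n '1s/.*load average: *\([0-9]*\).*/\1/p'
}

# Succeed when load ($1) is above the threshold ($2).
is_high() {
    [ "$1" -gt "$2" ]
}

main() {
    snapshot=$(top -bn1)
    load=$(printf '%s\n' "$snapshot" | load_from_top)
    if is_high "${load:-0}" "$THRESHOLD"; then
        log="$LOGDIR/loadwatch_$(date +%Y%m%d_%H%M%S).log"
        {
            # Save the top snapshot that triggered the alert, then the other tools.
            printf '%s\n' "$snapshot"
            vmstat 1 5
            iostat 1 5
        } >> "$log" 2>&1
    fi
}

# Run only when executed directly, not when sourced.
if [ "${BASH_SOURCE[0]}" = "$0" ]; then main; fi
```

Because cron starts a fresh copy each time, no loop or background job is needed. The trade-off is that a spike shorter than the cron interval can still be missed.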

