LinuxQuestions.org
Old 12-08-2014, 04:43 AM   #1
norman566
LQ Newbie
 
Registered: Dec 2014
Posts: 2

Rep: Reputation: Disabled
How do I create a shell script that monitors a violating process?


Hi all,

I'm kind of new to Linux. I have a question about how to create a script that monitors any process using >90% CPU and:

1. if it runs for 10 minutes, notify me

2. if it runs for 20 minutes, send final warning to me

3. if it runs for more than 25 minutes, kill the process.

Really appreciate some input on the matter.

Thanks,
Norman
 
Old 12-08-2014, 08:00 AM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: Sutton, MA. USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu
Posts: 4,087
Blog Entries: 10

Rep: Reputation: 1521
I've moved this last set of thoughts up to the top. I had some suggestions, but I realized that this should not be a chronic problem that you need a script to handle:


It strikes me that this isn't a normal situation. How is the system used: multiple users or a single user? If it's just you, you should not be starting things that overuse system resources, and you should already notice when something goes astray: you check, as you've likely done, see something bad, and kill it. Learn what you're doing wrong, and stop doing it.

If there are multiple users, then I understand that things are bad because someone else is causing this. It may be dangerous to write such a script anyway. That is, if you're the person responsible for resolving these issues, then keep doing what you have been doing: investigating a poorly performing system and correcting it.

Alongside that, I'd educate the users and make sure they understand that they shouldn't run processes which tie up all system resources. They are victims of the slowdown along with everyone else, so educating a user about bad practices also points out the obvious: they ought to be the one to correct the problem they've created. Reason #1 is that their processes are the parents of the offender, and reason #2 is so that they learn not to do this again.

A final thought: if this is a multiple-user system and the users are random, academic, or have free access for some reason, then when a person logs out, their processes should be terminated. If they keep logging in and overusing system resources all the time, one thing to do is identify who it is and suspend their login privileges until they learn better.

As for actually writing a script: first, if you keep asking, someone will likely post the whole thing for you out of their own ego; however, they shouldn't. The intention of LQ is that people guide you toward an answer so that you learn.

First few steps are:
  1. Understand how you'd monitor this process by hand. For instance, you may use top(1), w(1), ps(1), or some other command; for top(1), look at the -b, -n, and -u flags to help zero in on things.
  2. Select a scripting language to write your script in, learn how to write fundamental scripts, and make the effort to write a first script with the intention of resolving your question.
  3. If you solve it yourself, then please mark the thread as resolved and post your solution. If you get partway and need advice, either on how to do better or on how to accomplish a next step, then post your attempt so people can offer further advice.
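
To make step 1 concrete, here is a minimal sketch of a filter for a ps(1) snapshot. It assumes the procps column order `user,pid,pcpu,comm`; the function name `filter_hot` and the 90% limit are made up for illustration and are not a standard tool.

```shell
# filter_hot LIMIT
# Reads "USER PID %CPU COMMAND" rows on stdin, as printed by
# `ps -eo user,pid,pcpu,comm`, skips the header row, and prints
# "PID USER %CPU COMMAND" for every row whose %CPU is above LIMIT.
# (filter_hot is a hypothetical helper name.)
filter_hot() {
    awk -v limit="$1" 'NR > 1 && ($3 + 0) > (limit + 0) { print $2, $1, $3, $4 }'
}
```

One way to feed it would be `ps -eo user,pid,pcpu,comm | filter_hot 90`. Be aware that procps ps(1) reports %CPU as CPU time divided by the time the process has been running, so it is an average over the process lifetime. Batch output from top(1), for example `top -b -n 2`, tends to reflect a recent interval instead, but its columns would need different handling.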

My preference for a language is BASH.

What I say about BASH all the time is that "Anything you can do on the command line, you can also do within a BASH script." My point is that if you can monitor a process which uses too much time or CPU by hand, and kill it when you decide it has become too intrusive to the system, you can do the same with a BASH script.

I'm sure there are much, much better ways, but I would start by getting a brief output from top(1) and working out from that output, with grep or some other form of search, whether one or more processes were too heavy in their resource use, then save this information somewhere. I'd then sleep the script for some time, maybe one minute or five. After that I'd re-check the status of the system, determine whether the same process IDs were still offending or getting worse, and retain that information with some form of count of offenses. Once the number of offenses for a process ID reached a limit, I would conclude that the offending process needed to be killed and proceed to kill it.
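
The "count of offenses" part can be sketched as a small state file that is rewritten on every sweep. This is only one possible reading of the description above; the function name `update_counts` and the `PID COUNT` file layout are assumptions made for this example.

```shell
# update_counts STATE_FILE
# Reads the PIDs that are offending right now on stdin, one per line.
# Prints "PID COUNT" for each: the count stored in STATE_FILE plus one,
# or 1 for a PID not seen on the previous sweep. PIDs that stopped
# offending are simply not printed, which resets their count.
# (update_counts and the file layout are hypothetical.)
update_counts() {
    awk -v state="$1" '
        BEGIN {
            # Load the previous sweep; a missing file just means no history.
            while ((getline line < state) > 0) {
                split(line, f, " ")
                prev[f[1]] = f[2]
            }
        }
        NF { print $1, prev[$1] + 1 }
    '
}
```

A sweep might then be `ps -eo pid,pcpu | awk 'NR > 1 && $2 > 90 { print $1 }' | update_counts "$state" > "$state.new" && mv "$state.new" "$state"`, followed by `sleep 60`, where `$state` is a path of your choosing. Because a PID that drops below the limit vanishes from the file, a brief spike never accumulates a count.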

Some things to be aware of: a process may use 99% of the CPU briefly without that being a problem. However, if your script does a single check, sees 99% CPU, and declares a process bad, it may kill that process against your original intentions. The script itself may be written poorly and be a system offender, causing it to detect and kill itself. I recommend you understand what you will use to classify an offending process before you take corrective action. Also understand which processes are not user programs but system processes and daemons that support system services, because you don't want this type of script to start killing off daemons which are supposed to be there. Granted, no daemon or other system process is supposed to overuse system resources; however, this goes back to making sure you know what constitutes poor use or overuse of system resources.
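
Two of these cautions, not killing the monitor itself and leaving system processes alone, can be expressed as a guard that runs before any corrective action. This is a sketch under stated assumptions: the helper name `is_exempt` is invented, and treating UIDs below 1000 as system accounts is a common distribution default rather than a rule, so check /etc/login.defs on your own system.

```shell
# Lowest UID treated as a regular user (assumption: common default is 1000).
MIN_USER_UID=1000

# is_exempt PID OWNER_UID
# Succeeds (returns 0) when the process must never be touched:
#   - it is the monitoring script itself ($$ is this shell's PID), or
#   - its owner looks like a system account.
# (is_exempt is a hypothetical helper name.)
is_exempt() {
    local pid=$1 owner_uid=$2
    [ "$pid" -eq "$$" ] && return 0
    [ "$owner_uid" -lt "$MIN_USER_UID" ] && return 0
    return 1
}
```

The numeric owner is available from ps(1) as the `uid` column, for example `ps -eo uid,pid,pcpu,comm`, so each candidate can be checked with `is_exempt "$pid" "$uid" && continue` inside the loop that decides what to do.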
 
3 members found this post helpful.
Old 12-08-2014, 10:34 AM   #3
Toadbrooks
Member
 
Registered: Jul 2008
Distribution: Linux Mint 16
Posts: 75

Rep: Reputation: 8
Since rtmistler didn't write your script for you, but made suggestions, let me also suggest:

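This kind of problem is made to order for environment variables. In effect, this allows you to save a value in your environment variable space and use it on later passes. Keep in mind that a variable set inside a script only lasts as long as that script keeps running, so the check needs to loop inside one script rather than be started fresh each time. I'll include some pseudocode: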

Code:
1st detect the high usage process, perhaps using 'ps', sed or awk, and grepping for a value of 90 to 99%

2nd IF detected_value <> $PROBLEM_PROCESS {
        PROBLEM_PROCESS = detected_value
        PROBLEM_COUNT = 1
        }
    ELSE {
        PROBLEM_COUNT += 1
        IF PROBLEM_COUNT = 10 { Notify }
        IF PROBLEM_COUNT = 20 { Alert }
        IF PROBLEM_COUNT = 25 {
            kill -15 PROBLEM_PROCESS
            PROBLEM_COUNT = 0
            }
        }
sleep 60
The one glaring problem with the above pseudocode is that if two processes are running away, the script will shift the one it's looking at between them, resetting the count each time, and never kill either one.
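
One way around that limitation is to keep a separate counter per process ID. The following is a minimal sketch, assuming bash 4 or later for associative arrays; the names `STRIKES` and `record_sweep` are invented here, and the function only prints which action is due instead of sending mail or calling kill, so it is safe to experiment with.

```shell
#!/bin/bash
# One strike counter per PID, so two runaway processes no longer
# reset each other. (STRIKES and record_sweep are hypothetical names.)
declare -A STRIKES=()

# record_sweep PID...
# Call once per check with every PID currently over the limit.
# Prints "PID COUNT ACTION" per PID, where ACTION is none, notify,
# alert, or kill, then forgets PIDs that were not passed this time.
record_sweep() {
    local pid action
    local -A current=()
    for pid in "$@"; do
        current[$pid]=1
        STRIKES[$pid]=$(( ${STRIKES[$pid]:-0} + 1 ))
        case ${STRIKES[$pid]} in
            10) action=notify ;;
            20) action=alert ;;
            25) action=kill ;;
            *)  action=none ;;
        esac
        echo "$pid ${STRIKES[$pid]} $action"
    done
    # Drop counters for PIDs that are no longer offending.
    for pid in "${!STRIKES[@]}"; do
        [[ -n ${current[$pid]:-} ]] || unset "STRIKES[$pid]"
    done
}
```

Call it directly in the main loop, not inside `$( ... )`, because a command substitution runs in a subshell and the updated counters would be lost. With one sweep per minute the thresholds 10, 20, and 25 line up with the counts in the pseudocode.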

I hope you find this useful.

Last edited by Toadbrooks; 12-08-2014 at 10:37 AM.
 
1 members found this post helpful.
Old 12-29-2014, 02:25 AM   #4
norman566
LQ Newbie
 
Registered: Dec 2014
Posts: 2

Original Poster
Rep: Reputation: Disabled
Hi guys,

Thanks a lot for the input. This is a small multi-user environment that shares the resources of a single server. I would like to monitor it so that rogue processes do not slow down the server or, in a worse situation, bring down the whole server.

For a start, I managed to grab the output of the ps command and write it to a file. Then I convert the cputime to seconds and act accordingly.

Code:
#!/bin/bash
# Log non-root processes above 90% CPU, each line prefixed with its CPU time in seconds.
LOG=/home/nahosman/textfile/current.log
ps -eo user,pid,pcpu,cputime,args | grep -v root | awk '$3 > 90' | awk -F'[: ]+' '/:/ {t=$6+60*($5+60*$4); print t,$0}' > "$LOG"

# Fields: seconds, owner, PID, %CPU; REST holds the cputime string and command line.
# OWNER is used instead of USER so the login environment variable is not shadowed.
while read -r SECOND OWNER PID PCPU REST; do
    x=$((SECOND / 60))
    if (( 300 <= SECOND && SECOND <= 590 )); then
        echo "$(date) | The process $PID by user $OWNER running $PCPU% CPU for $x minutes on $(/bin/hostname) | cmdline : $(tr '\0' ' ' < "/proc/$PID/cmdline")"
    elif (( 600 <= SECOND && SECOND <= 660 )); then
        # Report only: no kill command is issued here yet.
        echo "$(date) | The process $PID by user $OWNER running $PCPU% CPU for $x minutes on $(/bin/hostname) has been killed"
    fi
done < "$LOG"
This is a pretty simple one, but I'm just using it as a start.
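
The cputime conversion could also live in its own helper, which makes it easier to check separately. This sketch assumes the `[DD-]HH:MM:SS` layout that procps ps prints for `cputime`; the function name `cputime_to_seconds` is invented here. Unlike the one-liner in the script, it also copes with the day prefix that appears once a process has used more than 24 hours of CPU.

```shell
# cputime_to_seconds "[DD-]HH:MM:SS"
# Prints the CPU time in whole seconds. Splits on both "-" and ":" and
# counts fields from the right, so the optional day part is handled.
# (cputime_to_seconds is a hypothetical helper name.)
cputime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        secs = $NF + 60 * $(NF - 1)
        if (NF >= 3) secs += 3600 * $(NF - 2)
        if (NF >= 4) secs += 86400 * $(NF - 3)
        print secs
    }'
}
```

Note that cputime measures CPU seconds consumed, not how long the process has existed. If wall-clock age is what matters for the 10, 20, and 25 minute limits, recent procps versions of ps offer an `etimes` column (elapsed seconds), assuming your ps supports it.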

Thanks.
 
  

