I've moved this last set of thoughts up to the top. I had some suggestions, but then I realized that this should not be a chronic problem, the kind you'd need to write a script for:
It's hitting me that this isn't a normal situation. How is the system used? Multiple users or a single user? If it's just you, you should not be starting things which overuse system resources. You'd also notice that something's amiss, so you check, as you've likely done, see something bad, and kill it. Learn what you're doing wrong, and stop doing that.
If this is a multiple user system, then I understand that things are bad because someone else is doing this. It may be dangerous to write such a script anyway. I.e. if you're the responsible person who needs to resolve these issues, then you keep doing what you have been doing: researching a poorly performing system and correcting it.
Alongside that, I'd educate the users and make sure they understand that they shouldn't run processes which tie up all system resources. Further, they'd be a victim along with the other programmers, so educating that user about bad practices also points out the obvious: they ought to be the one to correct the problem they've created. Reason #1 is that their processes are the parents of this, and reason #2 is so that they learn not to do it again.
One final thought: if this is a multiple user system and the users are random, academic, or have free access for some reason, then when a person logs out, their processes should be terminated. If someone is logging in and over-using system resources all the time, one thing to do is identify who it is and suspend their login privileges until they learn better.
As far as actually writing a script goes: number one, if you keep asking, someone will likely post the whole thing for you out of their own ego; however, they shouldn't. The intention of LQ is that people guide you toward an answer in a way that helps you learn.
First few steps are:
- Understand how "you'd" monitor this process by hand. For instance you may use top(1), w(1), ps(1), or some other commands. For top(1), look at the -b (batch mode), -n (number of iterations), and -u (limit to one user) flags to help zero in on things.
- Select a scripting language to write your script in, learn how to write fundamental scripts, and make the effort to write a first script with the intention of resolving your question.
- If you solve it yourself, please mark the thread as solved and post your solution. If you get partway and need advice, either on how to do better or on how to accomplish the next step, post your attempt so people can offer further advice.
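To give a feel for that first step without writing the whole thing for you, here is a rough sketch of turning a one-shot process listing into a list of heavy CPU users. The function name heavy_procs and the threshold of 80 are my own made-up choices, and I've used ps(1) here because its columns are easier to parse than top's. Be aware that the %CPU ps reports is averaged over the life of the process, while top reports it per sample interval, so the two will not always agree.

```bash
#!/usr/bin/env bash
# Sketch only: filter a process snapshot down to the heavy CPU users.
# Input lines are expected as "PID %CPU COMMAND".

# heavy_procs THRESHOLD
# Reads the snapshot on stdin and prints the lines at or above THRESHOLD.
heavy_procs() {
    awk -v limit="$1" '$2 + 0 >= limit + 0 { print $1, $2, $3 }'
}

# Example use on a live system (80 is an arbitrary threshold):
#   ps -eo pid=,pcpu=,comm= | heavy_procs 80
# A batch-mode top snapshot for a single user would be:
#   top -b -n 1 -u someuser
```

Feeding the function canned text first, rather than live output, is an easy way to check the filter before you trust it on a real system.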
My preference for a language is BASH.
What I say about BASH all the time is: "Anything you can do on the command line, you can also do within a BASH script." My point is that if you can monitor a process which uses too much time or CPU by hand, and kill it when you decide it's become too intrusive to the system, you can do the same with a BASH script.
I'm sure there are much, much better ways, but I would start by getting a brief output from top(1), then searching that output, with grep or some other form of search, to determine whether one or more processes were too heavy in their resource use, and saving this information somewhere. I'd then sleep the script for some time, maybe a minute or 5 minutes. Then I'd re-check the status of the system, determine whether the same process IDs were still offending or getting worse, and retain that information along with a count of offenses per process ID. Once the count for a given process ID reached a limit, I would conclude that the offending process needed to be killed and then proceed to kill it.
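The bookkeeping part of that, a count of offenses per process ID which is kept between checks, might be sketched like this. Everything named here (update_offenses, over_limit, watch_loop, CPU_LIMIT, MAX_OFFENSES, INTERVAL) is hypothetical and the values are guesses to tune for yourself. It samples with ps(1) rather than top(1) to keep the parsing simple, and it only prints what it would kill. A process which stops offending is dropped from the state file, so its count starts over. Note too that process IDs get reused, so a real script would want to check more than the PID before killing anything.

```bash
#!/usr/bin/env bash
# Sketch only: count repeat offenders across checks, act once a limit is hit.
CPU_LIMIT=90      # assumed %CPU at which a process counts as offending
MAX_OFFENSES=3    # assumed number of consecutive offenses before acting
INTERVAL=60       # seconds to sleep between checks

# update_offenses STATE_FILE
# Reads offending PIDs (one per line) on stdin and the previous "PID COUNT"
# lines from STATE_FILE. Prints the new state: a PID seen again has its count
# raised by one, a PID which is no longer offending is left out.
update_offenses() {
    awk -v state="$1" '
        BEGIN { while ((getline line < state) > 0) { split(line, f, " "); prev[f[1]] = f[2] } }
        NF    { print $1, prev[$1] + 1 }
    '
}

# over_limit LIMIT STATE_FILE
# Prints the PIDs whose offense count has reached LIMIT.
over_limit() {
    awk -v limit="$1" '$2 + 0 >= limit + 0 { print $1 }' "$2"
}

# watch_loop: sample, update the counts, report, sleep, repeat. Not started
# automatically; call it yourself once you trust the pieces above.
watch_loop() {
    state=$(mktemp) || return 1
    while :; do
        ps -eo pid=,pcpu= |
            awk -v limit="$CPU_LIMIT" '$2 + 0 >= limit + 0 { print $1 }' |
            update_offenses "$state" > "$state.new" && mv "$state.new" "$state"
        for pid in $(over_limit "$MAX_OFFENSES" "$state"); do
            echo "would kill $pid"    # swap in: kill "$pid" once you trust it
        done
        sleep "$INTERVAL"
    done
}
```

Keeping the counts in a plain file follows the "save this information somewhere" idea, and it also means you can look at the file by hand while the script is running to see what it thinks is going on.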
Some things to be aware of: "a" process may use 99% of the CPU briefly without that being a problem. However, if your script does a single check, sees 99% CPU, and declares a process bad, it may kill that process against your original intentions. The script itself may also be written poorly and be a system offender, causing it to detect and kill itself. I recommend you understand what you will use to classify an offending process before you take corrective action. Also understand which processes are not user programs but system processes and daemons which are there to support system services, because you don't want this type of script to start killing off daemons which are supposed to be there. Granted, no daemon or other system process is supposed to over-use system resources; however, this goes back to making sure you know what constitutes poor use or over-use of system resources.
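As one possible guard against those last two mistakes, the candidate list could be filtered before anything is killed. This is only a heuristic sketch: the function name killable is made up, and the MIN_UID of 1000 assumes a system where regular user accounts start at UID 1000 and system accounts sit below that, which you should verify on your own system rather than take from me.

```bash
#!/usr/bin/env bash
# Sketch only: drop processes the script must never kill.
# Input lines are expected as "PID UID %CPU COMMAND".
MIN_UID=1000   # assumption: UIDs below this belong to system accounts

# killable
# Passes through only lines which are not this script's own shell and which
# are owned by a UID at or above MIN_UID.
killable() {
    awk -v self="$$" -v min_uid="$MIN_UID" '$1 != self && $2 + 0 >= min_uid + 0'
}

# Example use on a live system:
#   ps -eo pid=,uid=,pcpu=,comm= | killable
```

A UID check will not catch everything, for instance a service which runs under a regular user account, so treat it as a starting point for deciding what counts as off limits.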