Find the history of a job killed by "kernel: Out of Memory: Killed process"
From time to time, and for long periods, one of our compute servers stops responding.
Nobody can log in and nobody can run a command (top, ps, ...). Afterwards, in /var/log/messages, we can see:

Sep 20 11:28:36 cluster-b kernel: Out of Memory: Killed process 16322 (Transp_image).
Sep 20 11:28:40 cluster-b kernel: Out of Memory: Killed process 10121 (rech).
Sep 20 11:28:41 cluster-b kernel: Out of Memory: Killed process 8045 (visu).
Sep 20 11:28:45 cluster-b kernel: Out of Memory: Killed process 25647 (Transp_image).

For each killed process, I would like to know:
- the owner of the process
- the memory usage of the process

Is that possible? Is there a file that contains this information? With it, I could track down the problem binary (in fact, I suppose the kernel kills the process that uses the most memory).

Thank you for your help and your ideas.

PS: in the future I could run a script that logs the activity, but I suppose another solution exists.
PS2: "sar" doesn't run.
If you can't add more memory, a "no cost" option is to add more swap space. It might only delay the problem, though.
If you can't or won't run "sar", try running "top" in batch mode and writing its output to a file, say every couple of minutes. That will at least give you a bit of history.
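If you would rather have a small script than raw "top" output, here is a minimal sketch of the same idea in Python. It assumes a Linux /proc where /proc/<pid>/status has "Name", "Uid" and "VmRSS" lines, and Python 3.6 or later. The function names, the log path /var/tmp/mem_history.log and the 120-second interval are only illustrative choices, not anything standard.

```python
import os
import pwd
import time


def parse_status(text):
    """Extract (name, real UID, resident memory in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    uid = int(fields["Uid"].split()[0])  # first number on the Uid line is the real UID
    # Kernel threads have no VmRSS line; count them as 0 kB.
    rss_kb = int(fields.get("VmRSS", "0 kB").split()[0])
    return fields.get("Name", "?"), uid, rss_kb


def snapshot(top_n=10):
    """Return the top_n processes by resident memory as (rss_kb, pid, user, name)."""
    rows = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/status") as fh:
                name, uid, rss_kb = parse_status(fh.read())
        except (OSError, KeyError, ValueError):
            continue  # process exited while being read, or unexpected format
        try:
            user = pwd.getpwuid(uid).pw_name
        except KeyError:
            user = str(uid)  # no passwd entry; fall back to the numeric UID
        rows.append((rss_kb, int(entry), user, name))
    rows.sort(reverse=True)
    return rows[:top_n]


def log_forever(path="/var/tmp/mem_history.log", interval=120):
    """Append one snapshot to `path` every `interval` seconds (hypothetical defaults)."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(path, "a") as out:
            for rss_kb, pid, user, name in snapshot():
                out.write(f"{stamp} pid={pid} user={user} rss_kb={rss_kb} name={name}\n")
        time.sleep(interval)
```

Calling log_forever() from a script left running in the background gives you the owner and resident memory of the biggest processes every couple of minutes, so after the next kill you can look up the PID from /var/log/messages in the history file. Keep in mind that a machine that is already out of memory may not manage to write the last samples.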
Quote:
But for the processes that were killed this morning, does a history of those kills exist somewhere, under /proc/ or /var/? Thank you.
Not of its memory usage, that I'm aware of.
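What the log does keep is the time, PID and command name of each kill. As a rough sketch, those can be pulled out of /var/log/messages with a few lines of Python, so they can be matched against whatever process history you do record. The pattern below assumes the exact message format quoted in this thread; other kernel versions word the message differently. The names OOM_RE and oom_kills are made up for the example.

```python
import re

# Matches lines such as:
# Sep 20 11:28:36 cluster-b kernel: Out of Memory: Killed process 16322 (Transp_image).
OOM_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"Out of Memory: Killed process (?P<pid>\d+) \((?P<name>[^)]*)\)"
)


def oom_kills(lines):
    """Return (timestamp, pid, command name) for every OOM kill line found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((match.group("when"), int(match.group("pid")), match.group("name")))
    return kills
```

For example, oom_kills(open("/var/log/messages")) would list every kill in the current log file, assuming you have permission to read it.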