OOM killer?
Hi
Our servers are behaving strangely. We use HP servers running RHEL 4 for simulation tasks. During processing, some servers kill the simulation tasks and many other processes, including ones that are essential for using the server, such as syslogd and sshd, so the server can then be reached only through the iLO port. This happens at weekends and at night, when tasks are more frequent. I immediately suspected an out-of-memory (OOM) case, but I have found nothing in the logs that confirms it. In fact:
- in /var/log/messages I have only: Jun 30 04:08:43 serversym1 exiting on signal 15
- the sar output for that day is Quote:
- Is this really an OOM case?
- How can I confirm the OOM assumption?
Thanks |
free -ms 5 > file
You should see swap usage climb if OOM is being hit, and then free itself might get killed. BTW, RHEL 4 is a bit long in the tooth these days, but of course you knew that. |
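Plain free output carries no timestamps, which makes it hard to line up with a syslog entry such as the 04:08:43 one. A minimal sketch of a timestamped sampler follows, assuming /proc/meminfo exposes MemFree and SwapFree lines; the function name sample_mem and the log path in the comment are hypothetical.

```shell
#!/bin/sh
# Print one line: "<date> <time> <MemFree kB> <SwapFree kB>".
# The file argument defaults to /proc/meminfo; it is a parameter only so the
# function can also be tried against a saved copy of that file.
sample_mem() {
    meminfo="${1:-/proc/meminfo}"
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    awk -v ts="$ts" '
        $1 == "MemFree:"  { memfree = $2 }
        $1 == "SwapFree:" { swapfree = $2 }
        END { print ts, memfree, swapfree }
    ' "$meminfo"
}

# Example loop (hypothetical log path), one sample every 5 seconds:
# while :; do sample_mem >> /var/tmp/memwatch.log; sleep 5; done
```

If the last samples before a crash still show plenty of free swap, memory exhaustion becomes a less likely explanation.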
sysstat will provide all that and a lot more.
If OOM_killer had been invoked there would be messages everywhere. I'd be suspecting some monitoring code checking for "vital signs" - seems loadavg is a popular one. |
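To check for those messages, something like the following could be run against the syslog file. It is only a sketch: the function name find_oom_lines is made up, and the two search patterns are assumptions, since the exact kernel wording differs between versions.

```shell
#!/bin/sh
# Print syslog lines that look like OOM-killer activity.
# Assumed patterns: "out of memory" and "oom-killer", matched case-insensitively.
find_oom_lines() {
    logfile="${1:-/var/log/messages}"
    grep -i -E 'out of memory|oom-killer' "$logfile"
}

# Example: show the most recent matches, if any
# find_oom_lines /var/log/messages | tail -n 20
```

No output here, on a box where syslogd was still running at the time, would point away from the OOM killer.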
You are all right.
I have looked at the logs more carefully and seen that out-of-memory cases are logged correctly every time they occur. So this does not seem to be an OOM killer case. So why the "exiting on signal 15" in /var/log/messages, and why does the server become unavailable? I did not understand whether you mean /proc/loadavg or a loadavg tool ... I also have Nagios on the servers, but it is killed before I can see anything, so there is no clear information from Nagios. |
I was thinking of your simulation product. It may be trying to protect itself. I've seen this mentioned somewhere (on a 2.4 kernel from memory), but I can't find the reference at the moment.
|
More crashes last night, and no solution so far.
The only thing I am sure of is that the crashes are caused by the interaction between this software and the servers. But I have seen the same tasks run on workstations with less CPU and RAM than the servers without crashing the PC, and the workstations run the same release of Red Hat as the servers. I'm really confused. |
Well, if you're running your simulation in a terminal, I will tell you what OOM looks like. I got it once compiling some FPGA stuff a guy had written in a brain fart, with umpteen libraries linked. When it came to the final ld, it threw me
Out of swap space
process killed
I repeated it with free -ms 5 running in a terminal, and watched that. RAM went, and swap was gobbled, then I got the lines above again. I did compile it in the end by unloading everything else (X, etc.) and just running the 2 bash terminals. It took 182 megs to link it, and I only had 197 available between swap and RAM, so I got there by just unloading other processes. That gave me a 2 Meg executable. |