LinuxQuestions.org
Old 07-05-2010, 08:19 AM   #1
fortez
Member
 
Registered: May 2009
Posts: 35

Rep: Reputation: 0
Oom killer?


Hi,
our servers are showing some strange behaviour.
We use HP servers with RHEL 4 for simulation tasks.

During a run, some servers kill the simulation tasks and many other processes, including ones that are essential for using the server, such as syslogd and sshd, so the server can then be reached only through the iLO port.
This has been happening at weekends and at night, when the tasks are more frequent.

I immediately suspected an OOM (out of memory) case, but I have found nothing in the logs to confirm this.

In fact:

- in /var/log/messages I have only:
Jun 30 04:08:43 serversym1 exiting on signal 15
- the sar output for that day is:
Quote:
00:00:01  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbswpfree  kbswpused  %swpused  kbswpcad
03:10:01    9401296    7012768     42.72      70672   4065020   32764440        116      0.00         0
03:20:01    9399760    7014304     42.73      70672   4065020   32764440        116      0.00         0
03:30:01    9398736    7015328     42.74      70672   4065020   32764440        116      0.00         0
03:40:01    9397264    7016800     42.75      70672   4065020   32764440        116      0.00         0
03:50:01    9396176    7017888     42.76      70672   4065020   32764440        116      0.00         0
04:00:02    9393744    7020320     42.77      70672   4065020   32764440        116      0.00         0
Average:    9397829    7016235     42.75      70672   4065020   32764440        116      0.00         0
So:
- is this really an OOM case?
- how can I confirm (or rule out) the OOM assumption?

Thanks
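
As a cross-check on the sar figures quoted above, here is a minimal Python sketch (not part of the original post) that recomputes the two percentage columns from the raw kB columns of one row. It assumes that %memused is kbmemused / (kbmemfree + kbmemused), which is consistent with the quoted numbers; the function name is made up for illustration.

```python
def sar_percentages(kbmemfree, kbmemused, kbswpfree, kbswpused):
    """Return (%memused, %swpused) recomputed from the raw kB columns."""
    mem_total = kbmemfree + kbmemused
    swap_total = kbswpfree + kbswpused
    mem_pct = 100.0 * kbmemused / mem_total
    # Guard against a machine with no swap configured.
    swap_pct = 100.0 * kbswpused / swap_total if swap_total else 0.0
    return round(mem_pct, 2), round(swap_pct, 2)


# Row logged at 03:10:01 in the quoted sar output.
print(sar_percentages(9401296, 7012768, 32764440, 116))
```

With roughly 9 GB still free and only 116 kB of swap in use, the samples up to 04:00:02 show no memory pressure. The samples are ten minutes apart, though, and the log line is stamped 04:08:43, so a short spike between samples would not show up here.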
 
Old 07-05-2010, 10:22 AM   #2
business_kid
Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware & Android
Posts: 6,179

Rep: Reputation: 527
free -ms 5 > file

You should see swap increasing if oom is hit, then free itself might go. BTW, RHEL 4 is a bit long in the tooth these days, but of course you knew that.
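
A rough scriptable equivalent of the logging suggested above, as a Python sketch only. It parses text in the /proc/meminfo layout (the file free takes its numbers from on Linux) instead of calling free. The function names are invented for this example, and the sample text is built from the sar figures in post #1 rather than copied from a real /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_kb(info):
    """Swap in use, in kB, from a parsed meminfo dict."""
    return info["SwapTotal"] - info["SwapFree"]


# Illustrative sample only, assembled from the sar row at 03:10:01.
SAMPLE = """MemTotal:     16414064 kB
MemFree:       9401296 kB
SwapTotal:    32764556 kB
SwapFree:     32764440 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("free kB:", info["MemFree"], "swap used kB:", swap_used_kb(info))
```

On a live machine the same functions could be fed open("/proc/meminfo").read() every few seconds, appending each result to a file, so that the last lines written before a crash show whether swap was climbing.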
 
Old 07-05-2010, 05:31 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,104

Rep: Reputation: 985
sysstat will provide all that and a lot more.
If OOM_killer had been invoked there would be messages everywhere. I'd be suspecting some monitoring code checking for "vital signs" - seems loadavg is a popular one.
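
To make the "messages everywhere" check repeatable, something like the following Python sketch could be run over a copy of /var/log/messages. The patterns are an assumption: the exact wording of the kernel's OOM messages differs between kernel versions, so the match is kept deliberately loose. In the sample, the first line is the one quoted in post #1; the other two are invented for illustration and are not from a real log.

```python
import re

# Loose, case-insensitive patterns; real kernel wording varies by version.
OOM_PATTERN = re.compile(r"out of memory|oom[-_ ]?kill", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


SAMPLE_LOG = [
    "Jun 30 04:08:43 serversym1 exiting on signal 15\n",  # from post #1
    "Jun 30 04:08:40 host kernel: Out of Memory: Killed process 1234 (sim).\n",  # invented
    "Jun 30 04:08:40 host kernel: oom-killer: gfp_mask=0xd0\n",  # invented
]

if __name__ == "__main__":
    for hit in find_oom_lines(SAMPLE_LOG):
        print(hit)
```

Because a file object iterates line by line, find_oom_lines(open("/var/log/messages")) works the same way. An empty result on the night of a crash would support the view that the OOM killer was not involved.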
 
Old 07-06-2010, 07:48 AM   #4
fortez
Member
 
Registered: May 2009
Posts: 35

Original Poster
Rep: Reputation: 0

You are both right.
I have gone through the logs more carefully and found that out-of-memory cases are correctly logged every time they occur.
So this does not seem to be an OOM killer case.

So why do I get "exiting on signal 15" in /var/log/messages, and why does the server become unavailable?

I did not understand whether you mean /proc/loadavg or the loadAVG tool ...
I also have Nagios on the servers, but it is killed before I can see anything, so Nagios gives me no clear information.
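
For reference on the "signal 15" part: on Linux, signal number 15 is SIGTERM, the termination request that kill sends by default and that a daemon can catch in order to log a message and exit cleanly. A small Python sketch of looking the number up from the log line (the helper name is made up for this example):

```python
import re
import signal


def signal_from_log(line):
    """Extract 'signal N' from a log line and return (N, name), or None."""
    match = re.search(r"signal (\d+)", line)
    if not match:
        return None
    number = int(match.group(1))
    try:
        name = signal.Signals(number).name
    except ValueError:
        # Number is not a signal known on this platform.
        name = "unknown"
    return number, name


print(signal_from_log("Jun 30 04:08:43 serversym1 exiting on signal 15"))
```

So that log line says that the process which wrote it was asked to terminate, whether by an administrator, a script, init during a shutdown, or some other program. It does not by itself say who sent the request.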
 
Old 07-06-2010, 05:15 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,104

Rep: Reputation: 985
I was thinking of your simulation product. It may be trying to protect itself. I've seen this mentioned somewhere (on a 2.4 kernel from memory), but I can't find the reference at the moment.
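
To illustrate the kind of "vital signs" check meant here, this is a purely hypothetical Python sketch of a load-average guard. Nothing in this thread shows that the simulation product works this way; the threshold, the sample text and the function names are all invented. It relies only on the /proc/loadavg layout, whose first three fields are the 1, 5 and 15 minute load averages.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def load_too_high(text, threshold):
    """Hypothetical guard: True when the 1-minute load exceeds threshold."""
    one_minute, _, _ = parse_loadavg(text)
    return one_minute > threshold


# Invented sample in the /proc/loadavg layout.
SAMPLE = "35.20 20.10 9.75 12/480 11206\n"

if __name__ == "__main__":
    print(parse_loadavg(SAMPLE), load_too_high(SAMPLE, threshold=32.0))
```

A program with a guard like this could decide to send SIGTERM to other processes once the load crosses its limit, which would look much like the symptoms described without any OOM message ever being logged.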
 
Old 07-07-2010, 03:07 AM   #6
fortez
Member
 
Registered: May 2009
Posts: 35

Original Poster
Rep: Reputation: 0
More crashes last night, and no solution so far.

The only thing I am sure of is that the crashes are caused by the interaction between this software and the server, but I have seen the same tasks run on a workstation with less CPU and RAM than the servers without crashing the PC.
The workstation runs the same release of Red Hat as the servers.
I'm really confused.
 
Old 07-08-2010, 03:37 AM   #7
business_kid
Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware & Android
Posts: 6,179

Rep: Reputation: 527
Well, if you're running your simulation in a terminal, I will tell you what OOM looks like. I got it once compiling some FPGA stuff a guy had written in a brain fart, with umpteen libraries linked. When it came to the final ld, it threw me:

Out of swap space
process killed

I repeated it with free -ms 5 running in a terminal, and watched that. RAM went, swap was gobbled, and then I got the lines above again. I did get it to compile by unloading everything else (X, etc.) and running just the 2 bash terminals. It took 182 MB to link, and I only had 197 MB available between swap and RAM, so I got there just by unloading other processes. That gave me a 2 MB executable.
 
  

