LinuxQuestions.org
Old 03-06-2008, 08:36 AM   #1
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 657

Rep: Reputation: 40
Understanding /var/log/messages memory output


One of my servers is having problems with memory starvation, and is killing off (mostly Java) processes to stay alive.

In order to debug the memory starvation issue I need a more thorough insight into how memory management is done in Linux. It would therefore be great if someone could provide a walkthrough of the essentials in the /var/log/messages extract provided below.

For example:
  • How do the different memory zones work?
  • What does it mean that memory is "all unreclaimable"?

I'm not completely new to this stuff, but would like an active discussion in order to get a more complete understanding.

Anyways, here's the log extract:

Mar 6 11:53:22 mercury kernel: oom-killer: gfp_mask=0xd2
Mar 6 11:53:22 mercury kernel: Mem-info:
Mar 6 11:53:22 mercury kernel: DMA per-cpu:
Mar 6 11:53:22 mercury kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 6 11:53:22 mercury kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 6 11:53:22 mercury kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 6 11:53:22 mercury kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 6 11:53:22 mercury kernel: Normal per-cpu:
Mar 6 11:53:22 mercury kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 6 11:53:22 mercury kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 6 11:53:22 mercury kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 6 11:53:22 mercury kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 6 11:53:22 mercury kernel: HighMem per-cpu:
Mar 6 11:53:22 mercury kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 6 11:53:22 mercury kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 6 11:53:22 mercury kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 6 11:53:22 mercury kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 6 11:53:22 mercury kernel:
Mar 6 11:53:22 mercury kernel: Free pages: 269900kB (512kB HighMem)
Mar 6 11:53:23 mercury kernel: Active:299042 inactive:231443 dirty:0 writeback:0 unstable:0 free:67475 slab:4698 mapped:530206 pagetables:2270
Mar 6 11:53:23 mercury kernel: DMA free:12524kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:4444 all_unreclaimable? yes
Mar 6 11:53:23 mercury kernel: protections[]: 0 116000 180000
Mar 6 11:53:23 mercury kernel: Normal free:256864kB min:928kB low:1856kB high:2784kB active:17224kB inactive:16292kB present:901120kB pages_scanned:2406355 all_unreclaimable? yes
Mar 6 11:53:23 mercury kernel: protections[]: 0 0 64000
Mar 6 11:53:23 mercury kernel: HighMem free:512kB min:512kB low:1024kB high:1536kB active:1178816kB inactive:909480kB present:4325376kB pages_scanned:7454098 all_unreclaimable? yes
Mar 6 11:53:23 mercury kernel: protections[]: 0 0 0
Mar 6 11:53:23 mercury kernel: DMA: 3*4kB 4*8kB 2*16kB 3*32kB 3*64kB 3*128kB 2*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB = 12524kB
Mar 6 11:53:23 mercury kernel: Normal: 146*4kB 25*8kB 17*16kB 6*32kB 2*64kB 2*128kB 5*256kB 10*512kB 1*1024kB 1*2048kB 60*4096kB = 256864kB
Mar 6 11:53:23 mercury kernel: HighMem: 20*4kB 6*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Mar 6 11:53:23 mercury kernel: Swap cache: add 12442227, delete 12441204, find 4256249/5650043, race 5+249
Mar 6 11:53:23 mercury kernel: 0 bounce buffer pages
Mar 6 11:53:23 mercury kernel: Free swap: 0kB
Mar 6 11:53:23 mercury kernel: 1310720 pages of RAM
Mar 6 11:53:23 mercury kernel: 1015792 pages of HIGHMEM
Mar 6 11:53:23 mercury kernel: 77337 reserved pages
Mar 6 11:53:23 mercury kernel: 6895 pages shared
Mar 6 11:53:23 mercury kernel: 1023 pages swap cached
Mar 6 11:53:23 mercury kernel: Out of Memory: Killed process 18552 (java).



Regards,
kenneho

Last edited by kenneho; 03-06-2008 at 08:38 AM.
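
For readers following along: the per-zone lines near the end of the dump (`DMA: 3*4kB 4*8kB ...`) list how many free blocks of each size a zone holds, and the stated total should match that zone's `free:` figure. A minimal Python sketch of that arithmetic, assuming the line format shown in the extract above (the function name `buddy_total_kb` is made up for illustration):

```python
import re

def buddy_total_kb(line):
    """Sum the 'count*sizekB' terms of one free-list line from the dump."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)

# Lines copied from the log extract above (syslog prefix dropped).
dma = ("DMA: 3*4kB 4*8kB 2*16kB 3*32kB 3*64kB 3*128kB 2*256kB "
       "0*512kB 1*1024kB 1*2048kB 2*4096kB = 12524kB")
highmem = ("HighMem: 20*4kB 6*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB "
           "0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB")

print(buddy_total_kb(dma))      # 12524
print(buddy_total_kb(highmem))  # 512
```

Both sums agree with the totals printed in the log. The HighMem line also shows that what little is free there is almost entirely in small blocks.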
 
Old 03-07-2008, 07:39 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3594
Quote:
Originally Posted by kenneho View Post
In order to debug the memory starvation issue I need a more thorough insight into how the memory management is done in Linux.
Since you used the word "thorough" you'll want to read a few docs before asking questions since these explain a lot:
- LinuxMMDocumentation (starting point),
- Understanding Virtual Memory (gentle intro),
- /usr/src/linux/Documentation/sysctl/vm.txt,
- Understanding the Linux Virtual Memory Manager,
- Understanding the Linux Kernel, chapter 8 "Memory management" (find yourself an online copy).
Don't mistake this for a RTFM answer, Linux VMM *is* interesting but not that easy to explain in a few sentences. At least I can't.


Quote:
Originally Posted by kenneho View Post
Mar 6 11:53:23 mercury kernel: Free swap: 0kB
So you ran out of swap. Gotta love Java apps. While "Tuning Linux VM on Kernel 2.6" is about Oracle, it does ask the question "How to diagnose VM problems?" The TS approach is generic, so it could help you too.
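
One way to read the per-zone lines in the extract is to compare each zone's `free:` value with its `min`/`low`/`high` watermarks. Roughly, background reclaim kicks in when free memory drops below `low` and stops once it is back above `high`, while `min` is the floor that ordinary allocations are not supposed to dig below. A rough Python sketch, assuming the line format from the first post (the names `ZONE_RE` and `zone_status` are made up for illustration):

```python
import re

# Matches the start of a per-zone summary line from the OOM dump.
ZONE_RE = re.compile(
    r"(DMA|Normal|HighMem) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB"
)

def zone_status(line):
    """Return (zone, free_kb, state) for a zone line, or None if no match."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    zone = m.group(1)
    free, wmin, low, high = (int(v) for v in m.groups()[1:])
    if free <= wmin:
        state = "at or below min"
    elif free < low:
        state = "below low"
    elif free < high:
        state = "below high"
    else:
        state = "above high"
    return zone, free, state

# Zone lines copied from the first post (syslog prefix and tail dropped).
lines = [
    "DMA free:12524kB min:16kB low:32kB high:48kB active:0kB",
    "Normal free:256864kB min:928kB low:1856kB high:2784kB active:17224kB",
    "HighMem free:512kB min:512kB low:1024kB high:1536kB active:1178816kB",
]

for line in lines:
    print(zone_status(line))
```

In the extract, HighMem sits exactly at its `min` watermark while Normal still reports roughly 250 MB free, so the shortage shows up per zone rather than as a machine-wide lack of free pages.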
 
Old 03-07-2008, 09:26 AM   #3
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 657

Original Poster
Rep: Reputation: 40
Thanks. Let me study the documentation and get back to this thread with whatever questions I may have.

And thank you for the link regarding diagnosing VM problems.
 
Old 03-09-2008, 08:52 AM   #4
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 657

Original Poster
Rep: Reputation: 40
I'm still reading up on Linux memory management, but would like to post a question that I can't get my head around. So far I've not been able to find the answer to this, and I would be very thankful for help on resolving this:

The first post in this thread shows that the OOM killer kills off Java processes.

To document which processes are hogging the memory I made a simple script that outputs the top five memory consuming processes once the swap usage is close to full. This is an extract of the output:

%MEM   PID      SZ     VSZ COMMAND
19.4 25152 1805468 1827488 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
 7.0 24416  769824  787740 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
 5.2 24845  716952  734868 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
 6.2 18489  463560  481452 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
 3.0 18666  407032  420840 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)


These are the five most memory-consuming processes around the time the OOM killer starts killing processes. What puzzles me is that the memory usage of these five processes adds up to merely around 40%. There are not enough remaining processes to fill up the remaining memory. So why does the OOM killer think I'm out of memory?

I should add that the server is configured with 1200 MB of swap space.
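
The script itself was not posted, so the following is only a sketch of how such a check might look in Python. It assumes swap usage is read from `/proc/meminfo` and that the process list comes from something like `ps -eo pmem,pid,sz,vsz,args` (which produces the same column headers as the output above). The function names are hypothetical, and the `SAMPLE_MEMINFO` values are invented to match the "1200 MB" and "Free swap: 0kB" figures mentioned in this thread.

```python
def swap_used_fraction(meminfo_text):
    """Return used swap as a fraction of total, from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are in kB
    total = fields.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - fields.get("SwapFree", 0)) / total

def top_by_mem(ps_text, n=5):
    """Return the n rows with the highest %MEM as (pmem, pid, command)."""
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        cols = line.split(None, 4)
        if len(cols) == 5:
            rows.append((float(cols[0]), int(cols[1]), cols[4]))
    return sorted(rows, key=lambda r: r[0], reverse=True)[:n]

# Invented sample: 1200 MB of swap, none of it free.
SAMPLE_MEMINFO = "SwapTotal:  1228800 kB\nSwapFree:         0 kB\n"

# Rows copied from the output posted above.
SAMPLE_PS = """%MEM PID SZ VSZ COMMAND
19.4 25152 1805468 1827488 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
7.0 24416 769824 787740 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
5.2 24845 716952 734868 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
6.2 18489 463560 481452 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
3.0 18666 407032 420840 /opt/ibm/WebSphere/ProcServer/java/bin/java (...)
"""

top = top_by_mem(SAMPLE_PS)
if swap_used_fraction(SAMPLE_MEMINFO) > 0.9:  # "close to full" threshold is a guess
    print(round(sum(pmem for pmem, _, _ in top), 1))  # 40.8
```

On a live system the two sample strings would be replaced by the contents of `/proc/meminfo` and the output of the `ps` command. Note that `%MEM` in `ps` is the resident set size relative to physical RAM, so pages that have already been swapped out do not count toward the 40.8% total.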
 
Old 03-10-2008, 06:30 AM   #5
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3594
I don't have the answer to this. All I can offer for consideration are some things to look into.

Wrt the system: does the HW/SW meet or exceed the specs Websphere suggests it needs (and what are your specs)? (And you should not view swap as something good: disk I/O is expensive.) Is this a production machine (that is, do you also have a staging box to test releases on)? When did the OOM situation start? What changed at that time? Does this happen with only this kernel or with other kernels as well? Is the system tuned for this task? Does the system run only this task, or are there other services running that could be expensive in other ways (CPU, disk I/O)? Have you gathered stats for plotting (Dstat, SAR)?

Wrt Websphere: which product(s) are you running? Is it started with heap restrictions (-Xms -Xmx)? Did the OOM situation occur right from the start after installation, or later on, suddenly (say after another application release)? Do the WAS logs show errors before OOM is invoked?

Wrt the application(s): have there been any major changes in code which also mark the start of OOM? Was there any profiling done? 'top' can help in some situations (for example, if RSS is only a fraction of VSZ you may have a lot of swapping going on), but Java apps do memory management differently, and what you want is to read up on Java and profiling. If you only have to deploy code and manage the server, the developers should (be forced to :-] ) do the grunt work for that.

Finally, since Websphere has a large community, did you check their bug tracker and community resources for clues?
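
The RSS-versus-VSZ comparison mentioned above can also be made per process from `/proc/<pid>/status`, which reports `VmSize` (virtual size) and `VmRSS` (resident size) in kB. A small Python sketch, with hypothetical function names and an invented sample (the numbers below are not from the server in this thread):

```python
def vm_fields_kb(status_text):
    """Extract the Vm* fields (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name.startswith("Vm"):
            fields[name] = int(rest.split()[0])
    return fields

def resident_fraction(status_text):
    """Return VmRSS / VmSize, i.e. how much of the address space is resident."""
    fields = vm_fields_kb(status_text)
    return fields["VmRSS"] / fields["VmSize"]

# Invented sample, for illustration only.
SAMPLE_STATUS = "Name:\tjava\nVmSize:\t 2000000 kB\nVmRSS:\t  500000 kB\n"

print(resident_fraction(SAMPLE_STATUS))  # 0.25
```

A low ratio alone does not prove heavy swapping, since a JVM can reserve address space it never touches, so this is a hint to investigate further rather than a diagnosis.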
 
Old 03-10-2008, 09:13 AM   #6
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3594
// (Posted here so I don't fsck up your other thread's -reply status) ... if http://www.linuxquestions.org/questi...memory-626965/ is related to all of this you should have posted your specs there saying it's a Java-based app. Postponing launch until you or the developers get a grip on problems seems only common sense to me.
 
Old 03-10-2008, 09:46 AM   #7
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 657

Original Poster
Rep: Reputation: 40
Quote:
Originally Posted by unSpawn View Post
// (Posted here so I don't fsck up your other thread's -reply status) ... if http://www.linuxquestions.org/questi...memory-626965/ is related to all of this you should have posted your specs there saying it's a Java-based app. Postponing launch until you or the developers get a grip on problems seems only common sense to me.
The threads are not directly related. And I should add that the servers I'm referring to are merely servers used for development - they are not production servers.

Thank you for your useful thoughts in your previous post here on the thread. I've not yet had the time to carefully study the issues you address, but intend to do so asap.
 
Old 03-10-2008, 10:09 AM   #8
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3594
Not production. Cool. Well, just post info as you go but if you can please try to work in the direction of HW -> OS -> SW -> application(s).
 
  

