LinuxQuestions.org
Old 08-23-2010, 11:32 PM   #1
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Rep: Reputation: 0
Committed memory keeps on increasing


Hi all,
I'm a newbie to Linux, so please be patient with me...

I have a legacy server, and through munin I've noticed that periodically committed memory keeps on increasing. The chart looks like a spiky sine graph, but every cycle both the max and the minimum increase. After a few days it will go over my physical memory and it will keep on growing until the machine crashes...
Usually I reset the memory by rebooting the machine, which is not a solution at all.
  1. Is there any way to know which process(es) took the committed memory?
  2. Is there any way to release the committed memory without rebooting the machine?
  3. Is there any way to cure my machine of this disease?

For reference, I'm using CentOS 5.5 i686,
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+

This is what I get from /proc/meminfo:
MemTotal: 3499776 kB
CommitLimit: 10023760 kB

Currently:
Committed_AS: 3972804 kB
which is bigger than my MemTotal.


Sorry for my bad English,
thank you and regards,

Dico

Last edited by dico; 08-23-2010 at 11:39 PM.
 
Old 08-24-2010, 06:58 AM   #2
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,044

Rep: Reputation: 1100
Quote:
Originally Posted by dico
it will go over my physical memory and it will keep on growing until the machine crashes.
It is not at all serious that committed memory goes higher than physical memory.

If it grows far enough to crash the system, obviously that is a real problem.

You may want to delay the crash by adding swap space. It is easier to add a swap file than to add or extend a swap partition, especially if the added swap space is only temporary during diagnosis of the problem.

Quote:
Is there any way to know which process(es) took the committed memory?
I don't know a good general way. But I do know that the committed memory in a process is a subset of the memory reported in the VIRT column of top. If you have any processes with unreasonably high committed memory, they must also have unreasonably high VIRT values.

Run top, then press F and o to show the processes with very high VIRT values. Then investigate the memory use of those processes.
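
For a non-interactive version of the same check, here is a minimal sketch, assuming Python 3 is installed on the box (the function names are made up for illustration). It reads the VmSize line of /proc/<pid>/status, which should correspond to the figure top shows as VIRT, and lists the largest processes first. Like VIRT, this is only an upper bound on what each process has committed.

```python
import os


def vm_size_kb(status_text):
    """Return the VmSize value (kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmSize line


def top_by_virt(limit=15, proc_root="/proc"):
    """List (VmSize kB, pid, name) for the largest processes, biggest first."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc, nothing to scan
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # process exited while we were scanning
        size = vm_size_kb(text)
        if size is None:
            continue
        # The first line of the status file is "Name:<tab>command".
        name = text.splitlines()[0].split(":", 1)[1].strip()
        rows.append((size, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for size, pid, name in top_by_virt():
        print(f"{size / 1024:8.0f}m {pid:6d} {name}")
```

Running it from cron every few minutes and appending the output to a file would give a history to compare against the munin chart.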

Quote:
Is there any way to release the committed memory without rebooting the machine?
Just kill the processes that commit too much memory.

Quote:
Is there any way to cure my machine of this disease?
Track down and fix the memory leak.

That is one reason you might want to increase swap space: You don't know which of the processes with very high VIRT values have legitimate reasons for that use, vs. which processes might have memory leaks. Higher swap space means you can let any leak run much longer and grow much bigger, so it is easier to distinguish a memory leak from legitimate memory use.
Quote:
MemTotal: 3499776 kB
I assume you have 4GB physically installed.

If you switch to a PAE kernel, you will probably get to use most of the .5 GB that is missing from the above MemTotal.

That won't fix your apparent memory leak, but it might make a noticeable improvement in performance.

Last edited by johnsfine; 08-24-2010 at 07:05 AM.
 
Old 08-24-2010, 09:48 AM   #3
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
@johnsfine: Thank you so so much for your reply. It helps a lot.

Code:
top - 09:43:32 up 3 days, 19:20,  1 user,  load average: 0.55, 0.64, 0.68
Tasks: 165 total,   1 running, 163 sleeping,   0 stopped,   1 zombie
Cpu(s): 11.2%us,  1.8%sy,  0.0%ni, 86.4%id,  0.0%wa,  0.2%hi,  0.3%si,  0.0%st
Mem:   3499776k total,  3026120k used,   473656k free,    62820k buffers
Swap:  8273872k total,      124k used,  8273748k free,  2146416k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
22177 nobody    16   0  136m  98m 6952 S  9.6  2.9   0:05.72 httpd
22126 nobody    16   0  138m  99m 7016 S  8.6  2.9   0:06.25 httpd
22263 nobody    15   0  128m  92m 9180 S  7.6  2.7   0:05.34 httpd
22720 nobody    17   0  130m  89m 4780 S  6.0  2.6   0:00.66 httpd
 5173 mysql     15   0  104m  77m 3456 S  5.3  2.3 309:04.49 mysqld
22819 nobody    15   0  117m  76m 3908 S  2.7  2.2   0:00.12 httpd
22316 nobody    15   0  121m  91m  15m S  2.3  2.7   0:01.96 httpd
22701 nobody    15   0  123m  86m 8964 S  2.0  2.5   0:02.19 httpd
22719 nobody    15   0  120m  81m 6440 S  2.0  2.4   0:00.99 httpd
22731 nobody    15   0  119m  79m 4640 S  2.0  2.3   0:00.28 httpd
22694 nobody    15   0  132m  92m 5376 S  1.7  2.7   0:01.54 httpd
22262 nobody    16   0  133m  95m 6996 S  1.3  2.8   0:02.58 httpd
22817 nobody    15   0  132m  91m 4708 S  1.0  2.7   0:00.63 httpd
So it's httpd and mysql that cause the problem.
I have to find a way to stop the memory leak.

As for switching to a PAE kernel, I'll look into it.
Again, thank you.

Regards,

Dico
 
Old 08-24-2010, 10:50 AM   #4
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,044

Rep: Reputation: 1100
How many httpd processes do you have and why do you have so many?

Quote:
Originally Posted by dico
So it's httpd and mysql that cause the problem.
You posted top output still sorted by CPU. You didn't use F o to sort by VIRT? Or it didn't work or what?

Last edited by johnsfine; 08-24-2010 at 10:54 AM.
 
Old 08-24-2010, 09:16 PM   #5
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by johnsfine
How many httpd processes do you have and why do you have so many?
I have a lot. Using top I see at least 33 of them.
We have so many httpd because we run a web server that hosts approximately 30 sites...

Quote:
Originally Posted by johnsfine
You posted top output still sorted by CPU. You didn't use F o to sort by VIRT? Or it didn't work or what?
It's not working. When I press F I get a list of fields. But when I press O the only thing that happens is
Code:
* O: VIRT       = Virtual Image (kb)
changed to
Code:
  o: VIRT       = Virtual Image (kb)
and vice versa...
So it's toggling which column to show. If I turn everything off but VIRT, this is what I get:
Code:
top - 21:18:24 up 4 days,  6:55,  1 user,  load average: 1.04, 0.94, 0.90
Tasks: 193 total,   3 running, 189 sleeping,   0 stopped,   1 zombie
Cpu(s): 15.7%us,  2.5%sy,  0.0%ni, 80.6%id,  0.3%wa,  0.0%hi,  0.8%si,  0.0%st
Mem:   3499776k total,  3301140k used,   198636k free,    98264k buffers
Swap:  8273872k total,      124k used,  8273748k free,  2471328k cached

 VIRT COMMAND
 125m httpd
 134m httpd
 137m httpd
 136m httpd
 129m httpd
 133m httpd
 134m httpd
 135m httpd
but still not sorted....

Last edited by dico; 08-24-2010 at 09:33 PM.
 
Old 08-24-2010, 09:36 PM   #6
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
Ah, found it. You mean O and o.

this is what I get

Code:
top - 21:34:42 up 4 days,  7:11,  1 user,  load average: 0.62, 0.74, 0.80
Tasks: 166 total,   1 running, 162 sleeping,   0 stopped,   3 zombie
Cpu(s): 15.1%us,  6.5%sy,  0.0%ni, 76.8%id,  0.2%wa,  0.2%hi,  1.3%si,  0.0%st
Mem:   3499776k total,  3220068k used,   279708k free,    99244k buffers
Swap:  8273872k total,      124k used,  8273748k free,  2408092k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5757 tomcat    18   0  205m 108m 5404 S  1.0  3.2  17:09.33 jsvc
19176 nobody    18   0  142m  99m 4520 S  0.0  2.9   0:00.43 httpd
18688 nobody    15   0  136m  96m 6600 S  0.0  2.8   0:26.53 httpd
18497 nobody    15   0  135m  94m 5344 S  1.0  2.8   0:02.81 httpd
19078 nobody    15   0  134m  94m 4992 S  0.0  2.8   0:00.78 httpd
19171 nobody    15   0  133m  92m 4696 S  0.0  2.7   0:00.96 httpd
19084 nobody    15   0  133m 101m  13m S  0.0  3.0   0:03.53 httpd
18526 nobody    15   0  133m 101m  12m S  0.0  3.0   0:02.75 httpd
19181 nobody    15   0  133m  92m 4640 S  0.7  2.7   0:00.41 httpd
19105 nobody    15   0  133m  93m 5280 S  0.0  2.7   0:01.23 httpd
19097 nobody    15   0  133m  99m  12m S  0.0  2.9   0:02.23 httpd
19111 nobody    15   0  132m  99m  11m S  0.0  2.9   0:00.93 httpd
19115 nobody    15   0  132m 100m  12m S  0.3  2.9   0:02.06 httpd
19102 nobody    15   0  132m  91m 4984 S  0.0  2.7   0:02.02 httpd
19080 nobody    15   0  132m  92m 5292 S  0.0  2.7   0:01.59 httpd
19173 nobody    18   0  132m  91m 4940 S  0.0  2.7   0:01.56 httpd
 
Old 08-25-2010, 06:40 AM   #7
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,044

Rep: Reputation: 1100
Quote:
Originally Posted by dico
We have so many httpd because we run a web server that hosts approximately 30 sites...
It looks like your 33 copies of httpd together commit slightly over 4GB of memory.
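
The arithmetic behind that estimate can be sketched as follows (Python 3 assumed). The VIRT figures are the 15 httpd rows from the sorted top output in post #6; scaling their average up to all 33 processes is an assumption, and since VIRT is only an upper bound on committed memory, the result is a rough ceiling rather than a measurement.

```python
# VIRT values (in MB) of the httpd rows shown in the sorted top output.
HTTPD_VIRT_MB = [142, 136, 135, 134, 133, 133, 133, 133, 133, 133,
                 132, 132, 132, 132, 132]


def estimate_total_mb(sample_mb, process_count):
    """Scale the average VIRT of a sample up to process_count processes."""
    return sum(sample_mb) / len(sample_mb) * process_count


if __name__ == "__main__":
    total_mb = estimate_total_mb(HTTPD_VIRT_MB, 33)
    print(f"about {total_mb:.0f} MB, or {total_mb / 1024:.1f} GB")
```

That comes to roughly 4411 MB, or about 4.3 GB.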

As I said before, it is OK that this amount is more than your physical memory.

So you still don't know what makes committed memory grow far beyond that. You have 8GB of swap space, so committed memory levels below 8GB should be no problem. The required working set of all your processes is another issue. That may grow with committed memory and if it is above the 3.5GB of physical memory, your system will get significantly slower.

But regarding your main issue, you apparently still haven't determined where the extra committed memory is nor whether it is legitimate memory use vs. some kind of leak. You need to use top after committed memory has grown significantly past 4GB but before that makes the system fail.

Do you then have more than 33 copies of httpd? Or are they each larger than they were when only 4GB was committed? Or are there some other processes with high values of VIRT?

Quote:
It's not working. When I press F I get a list of fields. But when I press O the only thing that happens is
That is what would happen if you pressed f instead of F.

Quote:
Originally Posted by dico
Ah, found it. You mean O and o.
O also works for the first key, but I'm pretty sure F works.

That second keystroke is case insensitive: O or o should work equally. But the first keystroke is case sensitive: F or O should do the right thing and f or o should do the wrong thing.

Last edited by johnsfine; 08-25-2010 at 06:46 AM.
 
Old 08-25-2010, 11:12 AM   #8
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
Again, Thanks for your replies

The only other process that takes a big amount of VIRT is tomcat, but since earlier today it has stayed at 205MB. On the other hand, the number of Apache instances has decreased to about 20. And when I look at my munin charts, Apache's request chart is always in sync with committed memory: when Apache's requests increase, so does the committed memory. The only problem is that as time goes by, Apache needs more and more committed memory for each of its instances. See the charts below:

Weekly Apache Chart -- Weekly Memory Chart

The gap that you see in the chart is when the machine crashed and we rebooted it. So, from top as you told me, it is Apache that takes the most VIRT, and judging by munin's charts, committed memory is always in sync with Apache, so I think it's Apache that somehow leaks.

Quote:
Originally Posted by johnsfine
That is what would happen if you pressed f instead of F.
O also works for the first key, but I'm pretty sure F works.
That second keystroke is case insensitive: O or o should work equally. But the first keystroke is case sensitive: F or O should do the right thing and f or o should do the wrong thing.
Sorry, my bad. F gives me the same result as O. I think previously I pressed f instead of F.

Last edited by dico; 08-25-2010 at 11:14 AM.
 
Old 08-25-2010, 11:44 AM   #9
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,044

Rep: Reputation: 1100
I think you are misinterpreting some important aspect of your problem.

Quote:
Originally Posted by dico
committed memory keeps on increasing. The chart looks like a spiky sine graph, but every cycle both the max and the minimum increase.
That might be normal correct behavior (because you are running many instances of httpd and most of them are very lightly loaded most of the time).

Quote:
it will keep on growing until the machine crashes...
It "will"?? Or it did?

Can you provide more info about that crash? I don't think your increasing level of committed memory would cause a crash.

I looked at the memory-week.jpg you just posted and see something very significant:

The max committed it lists is 26G, but the max swap usage it lists is 12M. That means almost all the committed memory is not actually used. As long as there is a large amount of free swap space, the level of committed memory will not have any significant effect on the system behavior. It certainly shouldn't cause a crash.
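
To see that relationship on a live system, a small sketch along these lines may help. It assumes Python 3, and the function names are invented for illustration. SAMPLE stitches together figures posted at different times in this thread (MemTotal, CommitLimit and Committed_AS from post #1, the swap totals from the top output), so treat it as test data only. On a real machine you would pass the contents of /proc/meminfo instead.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def summarize(info):
    """Compare committed memory with RAM and report how much swap is really in use."""
    swap_total = info["SwapTotal"]
    swap_used = swap_total - info["SwapFree"]
    return {
        "committed_vs_ram": info["Committed_AS"] / info["MemTotal"],
        "swap_used_kb": swap_used,
        # Guard against machines that have no swap configured at all.
        "swap_used_fraction": swap_used / swap_total if swap_total else 0.0,
    }


# Assembled from numbers quoted in the thread; not a single real snapshot.
SAMPLE = """\
MemTotal: 3499776 kB
SwapTotal: 8273872 kB
SwapFree: 8273748 kB
CommitLimit: 10023760 kB
Committed_AS: 3972804 kB
"""

if __name__ == "__main__":
    report = summarize(parse_meminfo(SAMPLE))
    print(f"Committed_AS is {report['committed_vs_ram']:.2f} x MemTotal")
    print(f"swap in use: {report['swap_used_kb']} kB "
          f"({report['swap_used_fraction']:.4%} of swap)")
```

With the sample figures, Committed_AS is about 1.14 times MemTotal while only 124 kB of swap is in use. The sample also satisfies CommitLimit = SwapTotal + MemTotal/2, which is what the default vm.overcommit_ratio of 50 gives; as far as I know that limit is only enforced when vm.overcommit_memory is set to 2.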

So if something does build up to a crash, it is something other than committed memory.
 
Old 08-25-2010, 10:16 PM   #10
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by johnsfine
I think you are misinterpreting some important aspect of your problem.
I could be misinterpreting almost anything. I have so little experience with Linux servers.

Quote:
Originally Posted by johnsfine
That might be normal correct behavior (because you are running many instances of httpd and most of them are very lightly loaded most of the time).
I'm puzzled: if it's normal, why does committed memory keep on increasing every time period? And it only goes down after a system reboot (would restarting the apache service work too?). As a note, this memory problem started a few months ago. Look at our monthly and yearly charts.
Each time committed memory drops back to the lowest level, it means that we've just rebooted the machine. But whether it's committed memory that caused the machine to crash or not, I'm not sure.

Quote:
Originally Posted by johnsfine
It "will"?? Or it did?
Can you provide more info about that crash? I don't think your increasing level of committed memory would cause a crash.
I'm not sure either. Usually we do a remote reboot through SSH when we notice that our sites have gone down and committed memory has become scary. Services went down but the system was still up (restarting services brought no luck, so we rebooted the machine, which brought committed memory back to the lowest level again). But this last crash was a bit different. The system really stopped (the machine hung), and we had to push the hard reset button. And the CPU chart shows something different. If you look at that orange gap, it started when we previously rebooted the machine, and was gone after the last machine reboot (when the system hung). I have no idea whether it is related to anything or not.

Quote:
Originally Posted by johnsfine
I looked at the memory-week.jpg you just posted and see something very significant:

The max committed it lists is 26G, but the max swap usage it lists is 12M. That means almost all the committed memory is not actually used. As long as there is a large amount of free swap space, the level of committed memory will not have any significant effect on the system behavior. It certainly shouldn't cause a crash.
I'll note this.

Quote:
Originally Posted by johnsfine
So if something does build up to a crash, it is something other than committed memory.
I've looked at /var/log/messages and it logged nothing suspicious. Anywhere else I should look?

Thanks for your trouble and your patience in answering me.

Regards,

Dico
 
Old 08-26-2010, 09:00 AM   #11
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,044

Rep: Reputation: 1100
Quote:
Originally Posted by dico
I'm puzzled: if it's normal, why does committed memory keep on increasing every time period? And it only goes down after a system reboot
It is normal for processes to allocate far more memory than they actually use. It is also normal for processes to keep memory allocations after they are done with them.

Each of your httpd processes may be so lightly loaded that it takes weeks for it to reach a normal operating level of memory allocation.

Quote:
whether it's committed memory that caused the machine to crash or not, I'm not sure.
I'm not sure whether committed memory is unrelated to your serious problem or whether it is another symptom of that problem. I'm pretty sure it is not in the main sequence of cause and effect from the problem to the crash. I'm confident that a high level of committed memory does no harm as long as you also have a high level of free swap space. It appears that you have never used half your swap space.

Quote:
Anywhere else I should look?
I think top and other measures of process and system memory use are good places to look. The trick is when to look.

You need to look when the system is a lot closer to being in trouble.

In case the problem is in kernel virtual memory limits, you should also capture a few copies of /proc/slabinfo and see which of those resource uses is growing significantly over time.
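
One way to compare such snapshots, sketched under a few assumptions: Python 3 is available, the snapshots were saved as plain copies of /proc/slabinfo (reading it normally needs root), and the file uses the 2.x layout in which the third and fourth columns are <num_objs> and <objsize>. The function names are hypothetical. It estimates each cache's size as num_objs * objsize and lists the caches that grew the most between two snapshots.

```python
import sys


def parse_slabinfo(text):
    """Map cache name -> approximate bytes (num_objs * objsize) for one snapshot."""
    usage = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize
    return usage


def slab_growth(before_text, after_text, limit=10):
    """Return [(cache name, bytes grown)] for caches that grew, largest first."""
    before = parse_slabinfo(before_text)
    after = parse_slabinfo(after_text)
    deltas = [(after[name] - before.get(name, 0), name) for name in after]
    deltas.sort(reverse=True)
    return [(name, delta) for delta, name in deltas[:limit] if delta > 0]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 slabdiff.py older_snapshot newer_snapshot
    with open(sys.argv[1]) as old, open(sys.argv[2]) as new:
        for name, delta in slab_growth(old.read(), new.read()):
            print(f"{name:24s} +{delta / 1024:.0f} kB")
```

Snapshots taken hours or days apart, for example with "cat /proc/slabinfo > slab.$(date +%s)" as root, are more telling than ones taken minutes apart.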
 
Old 08-27-2010, 01:42 AM   #12
dico
LQ Newbie
 
Registered: Aug 2010
Posts: 14

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by johnsfine
I think top and other measures of process and system memory use are good places to look. The trick is when to look.

You need to look when the system is a lot closer to being in trouble.

In case the problem is in kernel virtual memory limits, you should also capture a few copies of /proc/slabinfo and see which of those resource uses is growing significantly over time.
Ok, I'll take a look at it and keep you posted. Hopefully I can catch it at the right time.

Thanks for your help
 
  


All times are GMT -5.