Old 05-10-2013, 08:44 AM   #1
rainman1985_2010
Member
 
Registered: Oct 2010
Posts: 47

Rep: Reputation: 0
Question about page cache


Hi, everyone

I ran my program, which continuously reads data from disk and transfers it over the network, on a computer with swap turned off.

I found that the cached memory continuously increases, which eventually led to my program running out of memory.

Why is this happening?
Shouldn't the page cache be reclaimed when a process requires more memory and there is not enough free memory?
 
Old 05-10-2013, 08:48 AM   #2
mina86
Member
 
Registered: Aug 2008
Distribution: Debian
Posts: 517

Rep: Reputation: 229
Quote:
Originally Posted by rainman1985_2010 View Post
Shouldn't the page cache be reclaimed when a process requires more memory and there is not enough free memory?
It should. Does your program by any chance store data in memory and never free it? You might have a memory leak.
 
Old 05-10-2013, 08:59 AM   #3
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by rainman1985_2010 View Post
I found that the cached memory continuously increases,
That is reasonable.

Quote:
which eventually led to my program running out of memory.
That is not consistent with the cached memory being high.

Quote:
Shouldn't the page cache be reclaimed when a process requires more memory and there is not enough free memory?
It should and it is. I'm sure you misunderstood what happened.

If you provide more details, maybe someone can make a better estimate of what really happened.

If your program is 32-bit, it might have run out of virtual address space rather than "memory".

Last edited by johnsfine; 05-10-2013 at 09:01 AM.
 
Old 05-10-2013, 10:09 AM   #4
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,642
Blog Entries: 4

Rep: Reputation: 3933
You certainly did "misunderstand what happened."

"Cache" memory in this context is used to hold stuff that was recently read from disk files, as well as stuff that's nearby. It's the lowest-priority use of memory, done simply for efficiency and convenience. Linux will fill almost all available (read: "currently otherwise-unused") memory with it, rather than let the RAM go to waste. When actual memory pressure begins to appear, it will steal those pages first.

Personally, I would recommend always having some swap space assigned. IBM referred to "short-on-storage = SOS = ... --- ..." for a sly but pointed reason.
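
If you want to see the cache behaviour for yourself, here is a rough C sketch (untested, minimal error handling) that just parses /proc/meminfo and shows how much of the "used" memory is really reclaimable cache:

Code:
/* Rough sketch: show that most "used" memory may just be page cache.
 * Parses /proc/meminfo; minimal error handling on purpose. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long total = 0, freemem = 0, cached = 0;

    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "MemTotal: %ld kB", &total) == 1) continue;
        if (sscanf(line, "MemFree: %ld kB", &freemem) == 1) continue;
        if (sscanf(line, "Cached: %ld kB", &cached) == 1) continue;
    }
    fclose(f);

    printf("MemTotal:      %ld kB\n", total);
    printf("MemFree:       %ld kB\n", freemem);
    printf("Cached:        %ld kB  (reclaimable, not lost)\n", cached);
    printf("free + cached: %ld kB\n", freemem + cached);
    return 0;
}

Run it while your transfer program is busy; "free + cached" should stay large even when "free" alone looks small.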
 
Old 05-10-2013, 12:34 PM   #5
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by sundialsvcs View Post
Personally, I would recommend always having some swap space assigned.
I also recommend having some swap space on any system with a decent-sized hard drive (the case is less clear on systems with only an SSD). But that question is a distraction in this thread.

Lack of swap space is not our first guess for the problem on a Linux system with a large amount of RAM allocated to cache; it is an obscure possibility. Lack of swap space can cause an out-of-memory failure due to overcommit rules even when actual RAM use (other than cache) is small. But I would not investigate that obscure condition without a lot more information suggesting it is relevant to this case.
 
Old 05-10-2013, 12:47 PM   #6
mina86
Member
 
Registered: Aug 2008
Distribution: Debian
Posts: 517

Rep: Reputation: 229
Quote:
Originally Posted by johnsfine View Post
Lack of swap space can cause an out-of-memory failure due to overcommit rules even when actual RAM use (other than cache) is small.
Uh?

Also, the existence or lack of swap does not change anything; overcommitting may lead the OOM killer to kill processes regardless of whether the system has swap configured or not. The difference swap makes is that you'll know before it happens, because your system suddenly becomes unusable and the HDD spins like crazy.
 
Old 05-10-2013, 01:08 PM   #7
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by mina86 View Post
Also, the existence or lack of swap does not change anything
I didn't want to go down this side track, but now I don't want to leave the confusion that I think you added by responding to my side track.

"Out of memory" takes multiple forms. One is when the OOM killer kills some process because previously committed memory is actually touched and no physical memory is available to back it. Another is when a process requests more memory to be committed and the kernel refuses the request (because it would exceed the current overcommit rules). Typical processes have no fallback for a failure to commit memory, so the application fails due to "out of memory" (but not because of the OOM killer).

Assume the same set of processes doing the same things on two systems, each with a large amount of RAM used by cache and zero or insignificant use of swap space. One system has significant swap configured; the other doesn't. Neither system can trigger the OOM killer (it would take pages from cache first). But the one without swap configured might refuse a memory commit (causing an application to abort due to lack of memory) when the system with swap would grant the identical request (without actually using any swap).

That means the existence of the swap space makes a difference even though the swap space is not actually used. This condition is common in Windows (where the overcommit rules are secret and apparently idiotic). It is rare, but possible, in Linux (where the overcommit rules are published, are under your control, and have sane defaults in most distributions).
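
To make that concrete, here is a minimal sketch (the 48 GB figure is invented; the outcome depends on vm.overcommit_memory, vm.overcommit_ratio and how much swap is configured) of the kind of request the swapless box might refuse while the box with swap grants it, even though neither box would ever touch the pages:

Code:
/* Rough sketch: ask the kernel to commit a large anonymous mapping
 * without ever touching it.  Whether this succeeds depends on the
 * overcommit settings and on how much swap is configured, not on
 * how much RAM the page cache happens to be holding.
 * The 48 GB figure is made up for illustration. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 48UL * 1024 * 1024 * 1024;   /* hypothetical 48 GB */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED) {
        perror("mmap");           /* commit refused: "out of memory" */
        return 1;
    }
    puts("commit granted (no page has actually been touched yet)");
    munmap(p, len);
    return 0;
}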
 
Old 05-10-2013, 03:33 PM   #8
mina86
Member
 
Registered: Aug 2008
Distribution: Debian
Posts: 517

Rep: Reputation: 229
Quote:
Originally Posted by johnsfine View Post
But the one without swap configured might refuse a memory commit
As far as I can tell, Linux counts page cache as free when calculating overcommit ratio: http://lxr.linux.no/linux+*/mm/mmap.c#L137
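
If anyone wants to see where that accounting currently stands on their box, here is a rough sketch (untested, minimal error handling) that dumps the relevant numbers from /proc; note that CommitLimit is only enforced in strict mode (vm.overcommit_memory = 2):

Code:
/* Rough sketch: print the current overcommit policy and the kernel's
 * commit accounting.  Committed_AS is what everyone has committed so
 * far; CommitLimit only matters when vm.overcommit_memory = 2. */
#include <stdio.h>

static long read_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long v = -1;

    if (f) {
        if (fscanf(f, "%ld", &v) != 1)
            v = -1;
        fclose(f);
    }
    return v;
}

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long limit = 0, committed = 0;

    printf("vm.overcommit_memory = %ld\n",
           read_long("/proc/sys/vm/overcommit_memory"));
    printf("vm.overcommit_ratio  = %ld\n",
           read_long("/proc/sys/vm/overcommit_ratio"));

    if (!f)
        return 1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "CommitLimit: %ld kB", &limit) == 1) continue;
        if (sscanf(line, "Committed_AS: %ld kB", &committed) == 1) continue;
    }
    fclose(f);

    printf("CommitLimit   = %ld kB\n", limit);
    printf("Committed_AS  = %ld kB\n", committed);
    return 0;
}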
 
Old 05-10-2013, 06:26 PM   #9
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by mina86 View Post
As far as I can tell, Linux counts page cache as free when calculating overcommit ratio:
Correct.

I hope you (and the OP) understand there is a difference between committing memory and using it. It is normal for a process to commit more memory than it uses. The default overcommit rules allow the sum of committed memory across all processes to be more than the amount they could actually use, but they do not allow a single process to commit more than it could use.

The check compares the total of free swap plus most of the "free" RAM (including cache) against the sum of what that one process has already committed without using, plus the new amount that process wants to commit.

If a process has committed, but not used, a large amount of memory and then wants to commit more, the new request might fail due to lack of memory even though there is a large page cache (just not as large as what that process has committed and left unused). But if there were enough free swap space, that same request would not fail.
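
Here is a small sketch of that difference (the 8 GB / 64 MB figures are invented; it reads its own /proc/self/status, where VmSize is roughly what has been committed and VmRSS is what has actually been used):

Code:
/* Rough sketch: commit a lot, use a little, then show the difference.
 * VmSize in /proc/self/status grows with what is committed;
 * VmRSS only grows with what is actually touched.
 * The 8 GB / 64 MB figures are invented for illustration. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static void show(const char *when)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];

    if (!f)
        return;
    while (fgets(line, sizeof line, f))
        if (!strncmp(line, "VmSize:", 7) || !strncmp(line, "VmRSS:", 6))
            printf("%s  %s", when, line);
    fclose(f);
}

int main(void)
{
    size_t committed = 8UL << 30;     /* commit 8 GB ...           */
    size_t touched   = 64UL << 20;    /* ... but touch only 64 MB  */
    char *p;

    show("before");
    p = mmap(NULL, committed, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memset(p, 1, touched);            /* fault in a small fraction */
    show("after ");
    munmap(p, committed);
    return 0;
}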

My best guess is that this is not the problem the OP saw; it is either a less likely possibility or just a bad side track. What actually happened on the OP's system is probably very different from what was described in the initial post.
 
Old 05-11-2013, 11:46 PM   #10
rainman1985_2010
Member
 
Registered: Oct 2010
Posts: 47

Original Poster
Rep: Reputation: 0
Thanks everybody.

Actually, I ran HDFS on those Linux machines. I used "-Xmx" and "-Xms" to assign the datanode 25 GB of memory. But when HDFS had used only 10 GB of memory, it threw an Out-Of-Memory exception; meanwhile, the "cached memory" was about 50 GB (the server I used has 64 GB of memory in total). So I guess it is the page cache that occupied the system memory.
 
Old 05-11-2013, 11:50 PM   #11
rainman1985_2010
Member
 
Registered: Oct 2010
Posts: 47

Original Poster
Rep: Reputation: 0
By the way, it seems there is no such parameter as "cache_stop" in CentOS 6.3, which is the OS I use. Is there any other way to stop the page cache?
 
Old 05-12-2013, 05:50 AM   #12
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by rainman1985_2010 View Post
By the way, it seems there is no such parameter as "cache_stop" in CentOS 6.3, which is the OS I use. Is there any other way to stop the page cache?
You are still trying to fix the wrong thing. The page cache is not causing your problem.

Quote:
Originally Posted by rainman1985_2010 View Post
when HDFS had used only 10 GB of memory,
What do you mean by "used" and how did you measure that?

The important number would be how much anonymous memory the process has mapped, which may be far higher than the 10GB you think it is using.

Now that you have provided more information, the possibility I thought was obscure before seems likely.

I think some process is trying to commit a very large chunk of memory and aborting because of the overcommit rules. If you had a large swap space configured, that same operation would succeed. A different overcommit setting could also make the operation succeed.

A very important question is why the process is committing so much memory while using so little.

One reason might be that the process is attempting a sudden major increase in the amount of memory it actually uses. In that case, changing the overcommit rules to allow the commit would just delay the failure by a trivial amount: all those cache pages would then get consumed, and the OOM killer would kill the process. Under the same conditions, if you instead solved the problem with a big swap area, the application would likely slow to a crawl (though it might work OK; swap performance cannot be predicted that easily).

Maybe the process is leaking committed, unused virtual address space. In that case, anything that allows the commit would get you much further in normal operation, but if it is an ongoing resource leak it will ultimately cause more problems.
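
If it is the second case, the leak would look roughly like this caricature (the 256 MB chunk size is invented): mappings are created, never touched and never unmapped, so committed address space keeps climbing while actual use stays small, until some allocation finally fails:

Code:
/* Caricature of a commit leak: keep asking for anonymous mappings,
 * never touch them, never unmap them.  VmSize (committed) keeps
 * climbing while VmRSS (used) stays small, until an allocation fails.
 * The 256 MB chunk size is invented for illustration. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t chunk = 256UL << 20;       /* 256 MB per iteration */
    unsigned long n = 0;

    for (;;) {
        void *p = mmap(NULL, chunk, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            printf("mmap failed after %lu chunks (%lu MB committed)\n",
                   n, n * 256);
            return 1;
        }
        n++;                          /* leaked: never touched or unmapped */
    }
}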

Last edited by johnsfine; 05-12-2013 at 06:10 AM.
 
Old 05-12-2013, 11:06 PM   #13
rainman1985_2010
Member
 
Registered: Oct 2010
Posts: 47

Original Poster
Rep: Reputation: 0
Thanks for your quick reply, johnsfine

My Linux machine's "overcommit_memory" parameter is set to 0. Does this mean the overcommit rules are not in effect?

On the other hand, my program (HDFS) runs on the JVM and gets an OOM exception, but the JVM itself is not aborting.

Can you give me some advice about this?
 
Old 05-13-2013, 05:33 AM   #14
johnsfine
LQ Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,286

Rep: Reputation: 1197
Quote:
Originally Posted by rainman1985_2010 View Post
my program (HDFS) runs on the JVM and gets an OOM exception, but the JVM itself is not aborting.
I assumed all that from your previous post.

Quote:
Can you give me some advice about this?
Configure a large swap file or partition and see what effect that has. Some memory allocation request is failing. I don't know how to intercept that request to understand it at the point of failure. But with a large swap space available, that same request might not fail. Then it should be possible to examine the mapping of the process and understand the problem.

Detailed information about a process's memory mappings is in
/proc/<pid>/smaps
If you make a copy of that file before the big allocation happens and compare it with a copy made afterwards, that should give the best shot at understanding the problem.
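
Comparing two full smaps dumps by hand is tedious, so here is a rough sketch (untested; takes the pid as its only argument) that just totals the Size: and Rss: lines; comparing those totals before and after should show whether the growth is in committed-but-unused mappings:

Code:
/* Rough sketch: sum the Size: and Rss: fields of /proc/<pid>/smaps.
 * Run it before and after the big allocation and compare the totals.
 * Usage: ./smaps_total <pid>   (untested, minimal error handling) */
#include <stdio.h>

int main(int argc, char **argv)
{
    char path[64], line[256];
    long v, size_kb = 0, rss_kb = 0;
    FILE *f;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    snprintf(path, sizeof path, "/proc/%s/smaps", argv[1]);
    f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "Size: %ld kB", &v) == 1)
            size_kb += v;
        else if (sscanf(line, "Rss: %ld kB", &v) == 1)
            rss_kb += v;
    }
    fclose(f);

    printf("mapped (Size):        %ld kB\n", size_kb);
    printf("resident (Rss):       %ld kB\n", rss_kb);
    printf("committed but unused: ~%ld kB\n", size_kb - rss_kb);
    return 0;
}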

Also, the failing request is failing inside the JVM. You mentioned setting a limit on JVM memory allocations. I'm not expert enough in Java to be sure, but maybe that limit (rather than lack of swap) is the cause of the problem. HDFS may need more memory and you may need to configure the JVM to be willing to allocate that memory.
 
Old 05-13-2013, 07:04 AM   #15
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,642
Blog Entries: 4

Rep: Reputation: 3933
Exactly. The system is designed to avoid going over the edge ... which means staying a calculated distance away from the edge at all times. By allowing for the possibility of swap space, you give the algorithm another escape hatch.

But still, the behavior of this process is "most unusual," and therefore undoubtedly "buggy."
 
  

