Question about page cache

rainman1985_2010 · 05-10-2013, 08:44 AM

Hi, everyone

I ran my program, which continuously read data from disk and transfer the data through network, on a computer with swapp off.

I found that the cached memory continuously increase, which led to my program ran out-of-memory.

Why is this happening?
Shouldn't the page cache be reclaimed when process require more memory while there is not enough free memory?

mina86 · 05-10-2013, 08:48 AM

Quote:

Originally Posted by rainman1985_2010

Shouldn't the page cache be reclaimed when process require more memory while there is not enough free memory?

It should. Does your program store data in memory by any chance and never frees them? You might have a memory leak.

johnsfine · 05-10-2013, 08:59 AM

Quote:

Originally Posted by rainman1985_2010

I found that the cached memory continuously increase,

That is reasonable.

Quote:

which led to my program ran out-of-memory.

That is not consistent with cached memory high.

Quote:

Shouldn't the page cache be reclaimed when process require more memory while there is not enough free memory?

It should and it is. I'm sure you misunderstood what happened.

If you provide more details, maybe someone can make a better estimate of what really happened.

If your program is 32-bit, it might have run out of virtual address space rather than "memory".

sundialsvcs · 05-10-2013, 10:09 AM

You certainly did "misunderstand what happened."

"Cache" memory in this context is used to hold stuff that was recently read from disk files, as well as stuff that's nearby. It's the lowest-priority use of memory, done simply for efficiency and convenience. Linux will fill almost all available (read: "currently otherwise-unused") memory with it, rather than let the RAM go to waste. When actual memory pressure begins to appear, it will steal those pages first.

Personally, I would recommend always having some swap space assigned. IBM referred to "short-on-storage = SOS = ... --- ..." for a sly but pointed reason.

johnsfine · 05-10-2013, 12:34 PM

Quote:

Originally Posted by sundialsvcs

Personally, I would recommend always having some swap space assigned.

I also recommend having some swap space on any system with a decent size hard drive (it is less clear on systems with just SSD). But that question is a distraction in this thread.

We are not guessing lack of swap space was the problem in a Linux system with a large amount of ram allocated to cache. It is an obscure possibility. Lack of swap space can cause an out of memory failure due to over commit rules even when actual ram use (other than cache) is small. But I would not investigate that obscure condition without a lot more info to say it might be relevant to this case.

mina86 · 05-10-2013, 12:47 PM

Quote:

Originally Posted by johnsfine

Lack of swap space can cause an out of memory failure due to over commit rules even when actual ram use (other than cache) is small.

Uh?

Also existence or lack of swap does not change anything – over committing may lead OOM killer to kill processes regardless of whether system has swap configured or not. The difference swap makes is that you'll now before it happens as suddenly your system is unusable and HDD spins like crazy.

johnsfine · 05-10-2013, 01:08 PM

Quote:

Originally Posted by mina86

Also existence or lack of swap does not change anything

I didn't want to go down this side track, but now I don't want to leave the confusion that I think you added by responding to my side track.

"Out of memory" takes multiple forms. One is when the OOM killer kills some process because previously committed memory is used without being available. Another is when a process requests more memory to be committed and the kernel refuses that request (because it exceeds the current over commit rules). Typical processes have no fallback for a failure to commit memory, and thus the application fails due to "out of memory" (but not due to the OOM killer).

Assume the same set of processes doing the same things on two systems, each with a large amount of ram used by cache and zero or insignificant use of swap space. One system has significant swap configured. The other doesn't. Neither system can trigger the OOM killer (it would take pages from cache first). But the one without swap configured might refuse a memory commit (causing an application to abort due to lack of memory) when the system with swap would fill the identical request (without using any swap).

That means the existence of the swap space makes a difference even though the swap space is not actually used. This condition is common in Windows (where over commit rules are secret and apparently idiotic). This condition is rare, but possible, in Linux (where over commit rules are published, are under your control, and have sane defaults in most distributions).

mina86 · 05-10-2013, 03:33 PM

Quote:

Originally Posted by johnsfine

But the one without swap configured might refuse a memory commit

As far as I can tell, Linux counts page cache as free when calculating overcommit ratio: http://lxr.linux.no/linux+*/mm/mmap.c#L137

johnsfine · 05-10-2013, 06:26 PM

Quote:

Originally Posted by mina86

As far as I can tell, Linux counts page cache as free when calculating overcommit ratio:

Correct.

I hope you (and the OP) understand there is a difference between committing memory and using it. It is normal for a process to commit more memory than it uses. The default over commit rule allow the sum of committed memory over all processes to be more than the amount they could use. But it does not allow a single process to commit more than it could use.

It compares the total of free swap, plus most of "free" (including cache) ram to the sum of what that one process has already committed without using and the new amount that process wants to commit.

If a process has committed, but not used, a large amount of memory, then wants to commit more. It might fail due to lack of memory even though there is a large page cache (but not as large as the memory that process would have committed and unused). But if there were enough swap space free, that same request would not fail.

My best guess is that this is not the problem the OP saw. This is either a less likely possibility or just a bad side track. My best guess is what actually happended on the OP's system is very different from what was described in the initial post.

rainman1985_2010 · 05-11-2013, 11:46 PM

Thanks everybody.

Actually, I ran HDFS on those linux machines. I used "-Xmx" and "-Xms" to assign datanode 25GB memory. But when hdfs used only 10GB memory, it throws out an Out-Of-Memory exception, and meanwhile, the "cached memory" is about "50GB"(the server I used has 64GB total memory), So I guess it's page cache that occupied the system memory

rainman1985_2010 · 05-11-2013, 11:50 PM

By the way, it seems there's no such a parameter "cache_stop" in CentOS 6.3 which is the os that I use, is there any other way to stop page cache?

johnsfine · 05-12-2013, 05:50 AM

Quote:

Originally Posted by rainman1985_2010

By the way, it seems there's no such a parameter "cache_stop" in CentOS 6.3 which is the os that I use, is there any other way to stop page cache?

You are still trying to fix the wrong thing. The page cache is not causing your problem.

Quote:

Originally Posted by rainman1985_2010

when hdfs used only 10GB memory,

What do you mean by "used" and how did you measure that?

The important number would be how much anonymous memory the process has mapped, which may be far higher than the 10GB you think it is using.

Now that you provided more information, the possibility I thought was obscure before, seems likely now.

I think some process is trying to commit a very large chunk of memory and aborting because of the over commit rules. If you had a large swap space configured, that same operation would succeed. A different over commit setting could also make that operation succeed.

A very important question is why the process is committing so much memory while using so little.

One reason might be that the process is attempting a sudden major increase in the amount of memory it actually uses. In that case, changing over commit rules to allow the commit would just delay the failure a trivial amount, then all those cache pages would get consumed, then the OOM killer would kill that process. Under the same conditions, if you solved the problem with a big swap area, the application would likely slow to a crawl (but might work OK. Swap performance can't be predicted so easily).

Maybe the process is leaking committed unused virtual address space. In that case anything that allows the commit would get you much further in normal operation. But if it is an ongoing resource leak, it ultimately will cause more problems.

rainman1985_2010 · 05-12-2013, 11:06 PM

Thanks for your quick reply, johnsfine

My linux machine's "overcommit_memory" param is set to 0. Does this mean over commit rule is not working?

And on the other hand, my program(HDFS) is running over jvm, and get an OOM exception. But the jvm itself is not aborting.

Can you give some advices about this?

johnsfine · 05-13-2013, 05:33 AM

Quote:

Originally Posted by rainman1985_2010

my program(HDFS) is running over jvm, and get an OOM exception. But the jvm itself is not aborting.

I assumed all that from your previous post.

Quote:

Can you give some advices about this?

Configure a large swap file or partition and see what effect that has. Some memory allocation request is failing. I don't know how to intercept that request to understand it at the point of failure. But with a large swap space available, that same request might not fail. Then it should be possible to examine the mapping of the process and understand the problem.

Detailed information about memory mapping is in
/proc/pid/smaps
If you make a copy of that before the big allocation happens and compare it to after the big allocation happens, that should give the best shot at understanding the problem.

Also, the failing request is failing inside the JVM. You mentioned setting a limit on JVM memory allocations. I'm not expert enough in Java to be sure, but maybe that limit (rather than lack of swap) is the cause of the problem. HDFS may need more memory and you may need to configure the JVM to be willing to allocate that memory.

sundialsvcs · 05-13-2013, 07:04 AM

Exactly. The system is designed to avoid going over the edge ... which means staying a calculated distance away from the edge at all times. By allowing for the possibility of swap space, you give the algorithm another escape hatch.

But still, the behavior of this process is "most unusual," and therefore undoubtedly "buggy."