Memory reported as used, but can't account for what's using it.
My memory usage, as reported by free, is creeping up with time before eventually plateauing. free reported 2,495 MB used and 1,337 MB free. The summary in top seemed to concur with these values.
However, if I add up the memory used by the various processes, I cannot account for anywhere near that. Using ps_mem.py (http://www.pixelbeat.org/scripts/ps_mem.py), I can only account for 763 MB of RAM usage.
I am taking the free-memory value from the -/+ buffers/cache line, so this doesn't seem to be the widely encountered issue where people mistakenly read it from the first line, which includes buffers and cache: http://www.linuxatemyram.com/
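For anyone following along, the adjusted figures can be pulled straight out of free's second row. The output below is a made-up sample in the old procps format (as on RHEL 6), not the poster's actual numbers:

```shell
# Sample 'free -m' output in the old procps format (numbers illustrative).
# The '-/+ buffers/cache:' row shows usage with buffers and cache excluded,
# which is the figure that actually matters for application memory.
free_sample='             total       used       free     shared    buffers     cached
Mem:          3832       3640        192          0        120       1900
-/+ buffers/cache:       1620       2212
Swap:         2047         75       1972'

echo "$free_sample" | awk '/buffers\/cache/ {print "used:", $3, "MB  free:", $4, "MB"}'
```

On a live box, pipe `free -m` into the same awk instead of the sample text.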
To see whether the RAM reported as used by free was actually available, I used the "C munch program" at http://www.linuxatemyram.com/play.html to attempt to allocate 3000 MB of memory: far more than free suggested was available, but what I thought should be available based upon the memory accounted for by ps_mem.py.
This displaced some stuff into swap. But after the memory allocation program exited, free reported much lower memory usage, far closer to what I would expect based upon the ps_mem output.
Mem + swap used before munch: 2589 MB
Mem + swap used after munch: 1184 MB
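For what it's worth, you don't strictly need a C program to run this experiment; a GNU coreutils pipeline can hold a chosen amount of RAM in much the same way. A sketch, with the size scaled down from the 3000 MB test:

```shell
# tail must buffer its entire newline-free input before it can know what
# the "last 10 lines" are, so this pipeline holds ~100 MB of RAM until it
# finishes. Raise the -c argument to reproduce a 3000 MB munch.
head -c 100M /dev/zero | tail > /dev/null
echo "allocation released"
```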
This is on a headless server with a minimal install of Red Hat Enterprise Linux 6.2; the only software of any significance we have running is a source build of Apache httpd 2.4 and three instances of Tomcat.
If anyone can shed any light on what's going on with the memory usage, I would be very grateful. I have included the output of free, top (sorted by memory usage) and ps_mem.py below.
Thanks for the reply. Are you suggesting I misunderstood or missed something on the LinuxAteMyRam site? As I already said:
I am taking the free-memory value from the -/+ buffers/cache line, so this doesn't seem to be the widely encountered issue where people mistakenly read it from the first line, which includes buffers and cache: linuxatemyram
That was a totally pointless post - you are doing fine trying to chase this.
Memory allocation is a can of worms. Reading that script is a good start, but there is no good way to completely account for memory. Even the kernel devs have argued about this for years. "pss" was the best they could come up with for shared pages, and even that is (extremely) "rubbery".
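To illustrate the Pss point: each process's /proc/<pid>/smaps carries a Pss: field per mapping, where a page shared by N processes is charged 1/N to each of them. Summing it gives the least-bad per-process figure, and is roughly what ps_mem.py relies on where the kernel exposes it. A sketch:

```shell
# Sum the proportional set size (Pss) of the current process; shared pages
# count 1/N for each of the N processes mapping them. Substitute a PID for
# 'self' to measure another process (needs appropriate permissions).
awk '/^Pss:/ {sum += $2} END {print sum + 0, "kB"}' /proc/self/smaps
```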
Then there are the caches (did you try drop_caches as suggested on the linuxatemyram site?). Not to mention the buddy and slab allocators and their requirements (see /proc/meminfo and /proc/slabinfo).
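For reference, the drop_caches knob is a one-way trigger rather than a persistent setting; the value written selects what gets dropped. A hedged sketch of the usual invocation:

```shell
# Writing to drop_caches is root-only: 1 drops the page cache, 2 drops
# reclaimable slab objects (dentries and inodes), 3 drops both.
sync    # flush dirty pages first so they become droppable
if [ "$(id -u)" -eq 0 ]; then
    echo 2 > /proc/sys/vm/drop_caches
    echo "reclaimable slab dropped"
else
    echo "re-run as root to actually drop caches"
fi
```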
Thanks syg00, your post has helped me a great deal, and got me moving in the right direction (I think).
Quote:
Originally Posted by syg00
did you try drop_caches as suggested on the linuxatemyram site
I hadn't tried that. I had assumed (wrongly) that it would only clear out the stuff free lists under buffers + cache; in fact it also made the memory reported as used by free and by ps_mem balance (at least near enough).
I have to confess I know virtually nothing about how memory allocation works. Googling my issue has been problematic, as all I managed to find was the typical LinuxAteMyRAM issue of misreading free's output.
I've included below the results of drop_caches, plus /proc/meminfo before and after the drop_caches. I have no experience interpreting these, but comparing the before and after values, the memory discrepancy was probably accounted for by the SReclaimable line in /proc/meminfo. Some googling tells me this is "a cache of in-kernel data structures" (http://stackoverflow.com/questions/5...ng-discrepancy).
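For before/after comparisons like this, the three slab-related lines in /proc/meminfo are the ones to watch. A quick snapshot command (take one before drop_caches and one after, then diff):

```shell
# Slab = total slab memory; SReclaimable = the part the kernel can give
# back under pressure (dentry/inode caches live here); SUnreclaim = the rest.
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```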
Having wiped it out pretty recently, the discrepancy is only about 5% of my memory so far, but I can already see that dentry appears to account for most of the slabs, and to be growing quickly.
My hunch is that it's Apache httpd (as this seems to be the common factor where I've seen this major reporting discrepancy), but I'm not sure how I can tie these slab entries to particular applications.
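While slab entries can't be tied to applications, the per-cache totals are visible, so dentry growth can at least be tracked over time. A sketch; note that /proc/slabinfo may be readable only by root depending on the kernel, and slabtop presents the same data interactively:

```shell
# /proc/slabinfo columns start: name, active_objs, num_objs, objsize, ...
# It may be root-only depending on the kernel; slabtop shows the same data.
if [ -r /proc/slabinfo ]; then
    awk '$1 == "dentry" {printf "dentry: %s/%s objects, %s bytes each\n", $2, $3, $4}' /proc/slabinfo
else
    echo "/proc/slabinfo not readable; try as root or use slabtop"
fi
```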
I'm also still unclear about whether this is normal and I should be unconcerned, as the OS seems able to free it if it's needed, or whether it suggests something is wrong with my build of Apache (assuming that is the culprit).
Any other thoughts / advice on the subject would be greatly appreciated.
Quote:
Originally Posted by Paul Norwich
My memory usage, as reported by free, is creeping up with time, before eventually plateauing.
I'm also still unclear about whether this is normal and I should be unconcerned, as the OS seems to be able to free it if it's needed, or whether it suggests something is wrong with my build of apache (assuming that is the culprit).
The main point that linuxatemyram tries to make is that on *nix, the OS will use all the RAM it can find; that is the efficient thing to do, and leaving RAM idle would be a waste of RAM and money.
It has a very efficient scheduler that swaps out or otherwise recycles RAM depending on the most urgent, i.e. current, requirement.
As syg00 points out, you're welcome to investigate this, but it could send you insane ...
Unless you've got a definite performance issue, I'd regard it as an academic exercise for when you're bored.
Hmmmm - where to start. First off, use [code] tags when posting output - it keeps the alignment. If it's easier to read, people are more likely to try and help.
Quote:
Originally Posted by Paul Norwich
I have to confess I know virtually nothing about how memory allocation works.
Welcome to the club - with the exception of a handful of people, we're all in the same boat. It's complex, tedious to investigate and bloody near impossible to understand at the source code level.
Quote:
My hunch is that it's Apache httpd (as this seems to be the common factor where I've seen this major reporting discrepancy), but I'm not sure how I can tie these slab entries to particular applications.
I'm also still unclear about whether this is normal and I should be unconcerned, as the OS seems to be able to free it if it's needed, or whether it suggests something is wrong with my build of apache (assuming that is the culprit).
You can't tie slab entries to processes - a slab is potentially shared between allocation requests from multiple processes. It is possible to "reverse map" a virtual address back to a real page, provided the page is resident (not swapped out) and actually allocated.
You don't want to go there ...
There have been some issues (essentially memory leaks) with the slab allocator, but that was a few years back and certainly won't apply to RHEL 6.2. So I would consider this "working as designed" and just accept the situation if it isn't actually causing a problem. Poorly written/designed apps can drive memory fragmentation, which causes more slab cache entries to be allocated (which eats RAM). I've no idea whether Apache is guilty of this, but it certainly has a history of disregarding overall system health - much like every database system out there.
But I digress ...
Thanks to everyone for their comments, and sorry for the delay in replying (I have been on holiday).
Unfortunately this isn't purely an academic exercise for me: I'm "losing" up to 2.5 GB of RAM on some servers. Whilst it does seem possible to free the memory (as I described in previous posts), all of that memory is unavailable for disk caching in the meantime, which results in quite significant performance degradation on our web servers.
As the memory usage plateaus, I could just throw more memory at the problem, and have already done so on one server. But doing this across 30-50 servers has cost implications. I don't mind the cost if that much memory is genuinely needed, but I would like to better understand what's going on and rule out a fixable software bug before upgrading a lot of servers.