ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
I am working with an embedded system running Linux. I have a single process that has numerous threads doing the work... so there is really only a single process to worry about for all practical purposes. What I am trying to do is have this process offer IP based services as long as it has enough memory.
So for example, if my system has 8G, I want to keep allowing new connections to the services as long as there is at least 1G of free memory... but if there is less than that I want to reject new services... but continue allowing the existing connections.
Here is my problem... my process currently figures out what the current memory utilization is by looking as RSS. This works... the calculation is close enough for my purposes (yes, I know, shared memory can mess with this)... still, the calculation is fine. The BIG issue is... the RSS value seems to be a "high water mark". That is, it never shrinks since there are no other processes requesting memory... so my system stops accepting new connections even as old ones quit.
What are my options here? Is my only option to mess with malloc hooks or is there a better way to deal with this? Can I somehow "flush" my process so it gives back all unused RAM?
MrFenwayFool
Last edited by mrfenwayfool; 02-02-2013 at 01:06 PM.
I tried using drop_caches but I don't think that does anything to help me. The issue is that the process heap keeps expanding... sbrk() is never called to shrink the heap under normal circumstances. The result of this is that RSS shows as a "high water mark". For example, even though my process heap grows to 8G in size... that does NOT mean that my process is using ALL of that 8G at any given instant.
What I have found is that calling malloc_trim(0) every now and then from the application does indeed shrink the heap. I am playing with this but it seems calling that frequently really hits performance (there's a good reason the heap normally doesn't shrink!). Instead, I'm looking at mallinfo() which reports not only how large the heap is but also how much of the heap is actively allocated... which is really what I am really interested in.
The approach that I would take is simply to determine how many connections you think the system should accept, and build a simple but adjustable throttle to that effect.
You can sometimes do simple things like "wrapping" the memory allocation/release calls that your application uses with code that increments or decrements a global running total ... often faster than asking the memory-allocation subsystem for that figure.
You could design a simple servo-mechanism: a thread that wakes up every (configurable... n) seconds and decides whether the throttle should be raised or lowered, within (configurable) boundaries and by some (configurable) increment. This approach would make the system become responsive to conditions as they actually are observed to take place, and reduces the need to "guess correctly."
Yes, there's a reason why the segment doesn't shrink: it forces everything to grind to a stop while a heck of a lot of page-faults take place in quick succession. Instead, the system as-designed justs the virtual memory manager do its thing. Pages that are "part of the heap," but not actively being used, get paged out. Some memory-allocators use a special "zero-fill this block of memory" system call which gives the OS the opportunity to zero the page by removing it from the VM page-tables, but I don't recall if Linux does this or not.
Last edited by sundialsvcs; 02-05-2013 at 09:16 AM.
The approach that I would take is simply to determine how many connections you think the system should accept, and build a simple but adjustable throttle to that effect.
You can sometimes do simple things like "wrapping" the memory allocation/release calls that your application uses with code that increments or decrements a global running total ... often faster than asking the memory-allocation subsystem for that figure.
You could design a simple servo-mechanism: a thread that wakes up every (configurable... n) seconds and decides whether the throttle should be raised or lowered, within (configurable) boundaries and by some (configurable) increment. This approach would make the system become responsive to conditions as they actually are observed to take place, and reduces the need to "guess correctly."
Yes, there's a reason why the segment doesn't shrink: it forces everything to grind to a stop while a heck of a lot of page-faults take place in quick succession. Instead, the system as-designed justs the virtual memory manager do its thing. Pages that are "part of the heap," but not actively being used, get paged out. Some memory-allocators use a special "zero-fill this block of memory" system call which gives the OS the opportunity to zero the page by removing it from the VM page-tables, but I don't recall if Linux does this or not.
Hmmm... this has turned into a long and winding road.
Looking at malloc_stats output has enlightened me a bit on the memory usage. That is, there are multiple arenas of memory since libc is trying to prevent blocking due to multiple threads running (my particular application has many, many threads due to its long history). Anyway, I thought perhaps I could limit the number of arenas using MALLOC_ARENA_MAX... and while that did limit the arenas... I'm still seeing "in use" memory for each arena far below the "system space"... meaning the arenas are growing even though there *should* be plenty of memory available. I'm wondering if this is fragmentation or fastbins using up memory... then being freed... and never used again. My application is a massive C++ application. I'm going to look into this more closely. Seems like my application is not using as much memory as I thought if you can believe the statistics from malloc_stats. I'm also wondering if I should give tcmalloc a try and see if it has similar issues. Regardless... something is not right here.
Be very alert as to exactly which metric you are observing. When space is released and available for recycling, many metrics do not go down. Also, processes in a virtual memory environment can't see what the operating system's VM manager is doing (although the software is designed to be friendly to it).
Easily the best way to get meaningful statistics is to arrange for the process, itself, to create log datasets (or externally-visible in-memory "trace tables") which describe the workflow characteristics in its own, application specific, therefore this-business-specific terms. Get out of the abstract and into the concrete. If the processing system tracks units-of-work, track those units. Track easily-gathered memory manager statistics and include them in the same or in a separate stream (taking care that none of the stats are expensive to obtain, so as to skew the results by virtue of gathering them). Now, pull these data out to a statistics system: look for statistically significant correlations.
Build "tweakable" controls and knobs, not so much in operating-system terms but rather in "what this software system does for a living" terms. Gather data about the system with various settings and, once again, use a stats package.
I've done this sort of thing many times, and one of the most interesting things that comes from it is: "real surprises."
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.