Quote:
The slab value in meminfo seems so small... it can't be the entire kernel memory usage.
It is not the "whole" kernel memory usage, but it does correspond to the kernel's notion of a "heap", where malloc()/free()-style allocations happen. That makes it a very good place to look when monitoring for memory leaks. Other types of memory "allocations" (like mappings of memory regions that belong to hardware, such as a video card) are often static in nature.
Quote:
What are the huge number of factors that have to be taken into account ? :-)
The number of processes running on the system and the type of workload they have are the main variables, I guess, but that really does cover a pretty significant number of factors: for example, start with a relatively idle system and do a "grep 'Slab:' /proc/meminfo". Then start a job that touches a huge number of files (like a "grep -R foo /usr") and, as it runs, watch the 'Slab:' entry go up.
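If you want to watch that happen without re-running the grep by hand, here is a minimal C sampler along the same lines (the one-second interval is just an arbitrary choice):

Code:
/* Print the Slab: line from /proc/meminfo once a second, so you can
 * watch it climb while a file-heavy job (e.g. grep -R foo /usr) runs. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char line[256];

    for (;;) {
        /* Reopen each time: the values in /proc/meminfo are
         * regenerated on every read. */
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) {
            perror("/proc/meminfo");
            return 1;
        }
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "Slab:", 5) == 0) {
                fputs(line, stdout);
                fflush(stdout);
                break;
            }
        }
        fclose(f);
        sleep(1);
    }
}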
In the other direction, write a little C program that tries to use 80% or more of the available memory on your system, then watch how the kernel's memory usage goes down as it tries to free up cached data.
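Something like this sketch is what I have in mind; the size argument and the memset() are just one way to make sure the pages are really touched (an allocation you never write to creates no actual pressure):

Code:
/* Allocate the amount of RAM given in MiB on the command line and
 * touch every byte, forcing the kernel to shrink its caches.
 * Watch 'Slab:' in /proc/meminfo from another terminal meanwhile. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <MiB to allocate>\n", argv[0]);
        return 1;
    }

    size_t mib = strtoul(argv[1], NULL, 10);
    size_t total = mib * 1024 * 1024;

    char *buf = malloc(total);
    if (!buf) {
        perror("malloc");
        return 1;
    }

    /* Write to every page; untouched memory stays virtual. */
    memset(buf, 0xAA, total);

    printf("Holding %zu MiB; press Ctrl-C to release.\n", mib);
    pause();

    free(buf);
    return 0;
}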
So, all I mean by the "huge number of factors" is that it is kinda hard to look at how much memory the kernel is using and draw a meaningful conclusion like "this is due to a buggy driver" just because the number happens to be large.
I don't know what else to suggest except to measure /proc/slabinfo under typical load conditions and generate an email if the numbers get seriously out of whack. Perhaps write a script that does an "echo 2 > /proc/sys/vm/drop_caches" (2 drops the reclaimable slab objects, such as dentries and inodes) when the system load is low and compares the slab usage with some baseline? In thinking about this, I would love to learn of a better method myself!
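As a rough starting point for the monitoring idea, here is a small C check that could run from cron (which mails any output to you); the baseline is an assumption you would have to measure on your own system under typical load first:

Code:
/* Compare the Slab: value in /proc/meminfo against a baseline (in kB)
 * given on the command line; print a warning and exit non-zero if it
 * is exceeded, so a cron job's mail tells you something is off. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <baseline kB>\n", argv[0]);
        return 2;
    }
    long baseline = strtol(argv[1], NULL, 10);

    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) {
        perror("/proc/meminfo");
        return 2;
    }

    char line[256];
    long slab = -1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "Slab: %ld kB", &slab) == 1)
            break;
    fclose(f);

    if (slab < 0) {
        fprintf(stderr, "no Slab: line found\n");
        return 2;
    }
    if (slab > baseline) {
        printf("Slab usage %ld kB exceeds baseline %ld kB\n",
               slab, baseline);
        return 1;
    }
    return 0;
}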