Quote:
The slab value in meminfo seems so small... it can't be the entire kernel memory usage.
It is not the "whole" kernel memory usage, but it does correspond to the kernel's notion of a "heap", where malloc()/free()-style allocations happen. That makes it a very good place to look when monitoring for memory leaks. Other types of memory "allocations" (like mappings of memory regions that belong to hardware, such as a video card) are often static in nature.
Quote:
What are the huge number of factors that have to be taken into account ? :-)
The number of processes running on the system and the type of workload they have are the main variables, I guess, but that really does cover a pretty significant number of factors: for example, start with a relatively idle system and do a "grep 'Slab:' /proc/meminfo". Then start a job that touches a huge number of files (like a "grep -R foo /usr") and, as it runs, watch the 'Slab:' entry go up.
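If you want to watch that happen without re-running the grep by hand, here is a minimal C sampler along the same lines (the one-second interval is just an arbitrary choice):

Code:
/* Print the Slab: line from /proc/meminfo once a second, so you can
 * watch it climb while a file-heavy job (e.g. grep -R foo /usr) runs. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char line[256];

    for (;;) {
        /* Reopen each time: the values in /proc/meminfo are
         * regenerated on every read. */
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) {
            perror("/proc/meminfo");
            return 1;
        }
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "Slab:", 5) == 0) {
                fputs(line, stdout);
                fflush(stdout);
                break;
            }
        }
        fclose(f);
        sleep(1);
    }
}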
In the other direction, write a little C program that tries to use 80% or more of the available memory on your system, then watch how the kernel's memory usage goes down as it tries to free up cached data.
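Something like this sketch is what I have in mind; the size argument and the memset() are just one way to make sure the pages are really touched (an allocation you never write to creates no actual pressure):

Code:
/* Allocate the amount of RAM given in MiB on the command line and
 * touch every byte, forcing the kernel to shrink its caches.
 * Watch 'Slab:' in /proc/meminfo from another terminal meanwhile. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <MiB to allocate>\n", argv[0]);
        return 1;
    }

    size_t mib = strtoul(argv[1], NULL, 10);
    size_t total = mib * 1024 * 1024;

    char *buf = malloc(total);
    if (!buf) {
        perror("malloc");
        return 1;
    }

    /* Write to every page; untouched memory stays virtual. */
    memset(buf, 0xAA, total);

    printf("Holding %zu MiB; press Ctrl-C to release.\n", mib);
    pause();

    free(buf);
    return 0;
}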
So, all I mean by the "huge number of factors" is that it is kinda hard to look at how much memory the kernel is using and draw a meaningful conclusion like "this is due to a buggy driver" just because the number happens to be large.
I don't know what else to suggest except to measure /proc/slabinfo under typical load conditions and generate an email if the numbers get seriously out of whack. Perhaps write a script that does an "echo 2 > /proc/sys/vm/drop_caches" (2 drops the reclaimable slab objects, such as dentries and inodes) when the system load is low and compares the slab usage with some baseline? In thinking about this, I would love to learn of a better method myself!
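As a rough starting point for the monitoring idea, here is a small C check that could run from cron (which mails any output to you); the baseline is an assumption you would have to measure on your own system under typical load first:

Code:
/* Compare the Slab: value in /proc/meminfo against a baseline (in kB)
 * given on the command line; print a warning and exit non-zero if it
 * is exceeded, so a cron job's mail tells you something is off. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <baseline kB>\n", argv[0]);
        return 2;
    }
    long baseline = strtol(argv[1], NULL, 10);

    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) {
        perror("/proc/meminfo");
        return 2;
    }

    char line[256];
    long slab = -1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "Slab: %ld kB", &slab) == 1)
            break;
    fclose(f);

    if (slab < 0) {
        fprintf(stderr, "no Slab: line found\n");
        return 2;
    }
    if (slab > baseline) {
        printf("Slab usage %ld kB exceeds baseline %ld kB\n",
               slab, baseline);
        return 1;
    }
    return 0;
}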