How does slab coloring maximize the use of Cache Lines or Cache Rows?
Linux - Kernel. This forum is for all discussion relating to the Linux kernel.
Hi,
I was going through slab coloring in the kernel code, and I am able to correlate the use of slab coloring with cache memory as follows:
Let's assume we have a 32-bit system, so the physical address is 32 bits wide. This address is split into three parts: Tag, Index and Offset (the byte within the cache line). The widths of these fields are determined by the cache line size and the total cache size. So let's say the cache is 4 MB and each cache line is 64 bytes. Then the Offset is 6 bits (log2 of the line size), the Index is 16 bits, i.e. the 22 bits needed to address the 4 MB cache minus the 6 Offset bits, and the remaining 10 bits are used for the Tag.
The cache memory has various rows; for a cache size of 4 MB and a cache line of 64 bytes, there are 65536 rows. Each row has two fields, Tag and Data: Data holds the line fetched from main memory, and Tag holds the value that forms part of the physical address as defined above.
When the CPU looks up a memory address, it checks the cache, and while doing so it breaks the physical address into the three components Tag, Index and Offset. The Index field identifies the row: in a direct-mapped cache the Index selects exactly one row. Once the row is known, the Tag field stored in the cache is compared with the Tag field of the physical address; if they match, a cache hit has occurred, otherwise a cache miss.
I used the above description to make clear how I understand cache memory.
In the Linux kernel, a (software) slab cache uses many slabs, and each slab allocates at least one page from the buddy system to hold its objects. The pages allocated by different slabs may fall under the same Index field of the physical address, and because of that those slabs are going to compete for the same cache lines, which results in poor performance, since the same cache line is reused for objects on different slabs.
Let's see how this can happen in the cache system defined above:
The cache has 65536 rows of 64 bytes each, which means 16 bits of the physical address for the Index and 6 bits for the Offset. Now, a page is 4 KB in size, so a page offset spans 12 bits: the 6 Offset bits plus the low 6 bits of the Index. The remaining 10 bits of the Index represent the number of pages sharing the same Index. What does that mean? It means that those bits of the Index give the number of pages that are going to use the same cache row (cache line), in our case 2^10 = 1024. I have not considered the Tag bits of the physical address in this count; if I do, 1024*1024 pages can end up using the same cache row.
With so many pages mapping to the same cache row, how can performance be improved?
Performance can be improved by using different cache rows for objects on different slabs. As we know, each slab starts at a page-aligned physical address, so this can be done by offsetting the first object on each slab by a multiple of the cache line size. If we offset the objects on one slab relative to another, they use different cache rows.
This is what I have learned from the Linux kernel about maximizing the use of cache lines, but I still have one question: the first object on each slab gets displaced by an offset, so different slabs' first objects use different cache lines. How does this effectively improve performance when we are targeting only the first object? Can I say that the first object holds important information, i.e. the struct slab_t, and that is the reason to access this information as fast as possible?
Please let me know if I am missing anything here.
Regards,
Pankaj
Last edited by pmehtal; 10-11-2011 at 06:40 AM.
Reason: Added a question at the end?
First of all, please be aware that threads that haven't been replied to after 24 and 48 hours will be automatically bumped. Therefore, there's no point in bumping your own thread if you're not providing any more information (and doing so before 24 or 48 hours will stop it being automatically bumped). Also, it might be that people who are able to answer your question aren't online when you post, or simply that nobody is able to answer.
Have you tried asking the kernel developers directly, using one of the mailing lists?
After parsing through a very long and rambling page of text, the answer is simple enough: once you establish the position and size of the first object, all the others necessarily follow suit due to the padding.
The "colors" idea probably came from "colors of the rainbow." Hardware caches view the world as a series of overlapping stripes. The slab allocator consciously wastes memory in this case to give the entries in a particular slab a tendency to occupy a (rainbow) stripe of a particular "color." You would use this basically when DMA is involved and when you knew that the activity would be very fast and intense, thus justifying the waste of memory.
But, yes, please don't post a question like that and then "start yammering for attention." People don't hang out here just waiting with bated breath for a complex technical question to show up, and they especially tend to skip over posts that do not get directly to the point.
You made your posting very difficult and laborious to "parse." The majority of your post is a direct quote from an existing O'Reilly book: therefore you only need to cite the book and the page-number. You finally get around to asking your question in the very last paragraph. I think that you would have answered your own question if you had instead thoughtfully re-read the chapter several times and looked directly at the comments in the kernel source-code itself. (Please do not interpret that last sentence as "RTFM" because that is not my intention.)
I found this thread while looking for answers to apparently the same questions.
In none of the books about the Linux kernel can I find any concrete example. The best I have found is the statement that this scheme only _should_ help achieve better cache usage.
One note about your deduction:
Quote:
Originally Posted by pmehtal
The remaining 10 bits of the Index represent the number of pages sharing the same Index. What does that mean? It means that those bits of the Index give the number of pages that are going to use the same cache row (cache line), in our case 2^10 = 1024.
As I understand it, this is not exactly correct. The upper 10 bits of the Index are actually the 10 least significant bits of the 20-bit page frame number (there are 2^20 pages in the 4 GB space). So every successive page starts in a different cache block (cache line), until the 1025th page, which again occupies the "bottom" of the cache. So slab coloring in this case wouldn't have any impact on cache usage for slabs within a 1024-page range.