Kernel build configuration, help with maximum NUMA nodes
I'm building an i7-based machine and I've been looking at kernel optimizations for it. One kernel option that I'm sketchy on, and haven't been able to find a good resource for, is MAXIMUM NUMA NODES. The default is 6, but I'm not sure if that's best for my particular hardware, and I'm not exactly sure what this option is all about.
As best I can figure, both the i7 920 processor and the X58 chipset support NUMA. I could be wrong. It's not the easiest topic to find layman's information for.
Quote:
As best I can figure, both the i7 920 processor and the X58 chipset support NUMA.
Yes, both chips can be used in a NUMA system.
However, if your mainboard has only one processor (socket), then you have only one memory controller, and you do not have a NUMA system.
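If you want to double-check what the kernel actually sees at runtime, a small program against libnuma will tell you. This is a minimal sketch, assuming libnuma (from the numactl package) is installed; build it with gcc numa_check.c -o numa_check -lnuma:
Code:
/* Ask the kernel how many NUMA nodes it sees. */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        /* NUMA support is not compiled into this kernel */
        printf("NUMA is not available here\n");
        return 0;
    }
    /* On a one-socket i7 board this should report node 0 only. */
    printf("highest node number: %d\n", numa_max_node());
    printf("configured nodes:    %d\n", numa_num_configured_nodes());
    return 0;
}
On a one-socket board you should see a single node, which matches the point above.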
Quote:
It's not the easiest topic to find layman's information for.
This is what is confusing me: since the memory controller is physically on the CPU with the i7, and QPI replaces the front-side bus, then a NUMA-enabled processor with a NUMA-enabled chipset SHOULD be all that's required for NUMA?
Quote:
Coherency Leaps Forward at Intel
CSI is a switched fabric and a natural fit for cache coherent non-uniform memory architectures (ccNUMA). However, simply recycling Intel’s existing MESI protocol and grafting it onto a ccNUMA system is far from efficient. The MESI protocol complements Intel’s older bus-based architecture and elegantly enforces coherency. But in a ccNUMA system, the MESI protocol would send many redundant messages between different nodes, often with unnecessarily high latency. In particular, when a processor requests a cache line that is stored in multiple locations, every location might respond with the data. However, the requesting processor only needs a single copy of the data, so the system is wasting a bit of bandwidth.
Intel's solution to this issue is rather elegant. They adapted the standard MESI protocol to include an additional state, the Forwarding (F) state, and changed the role of the Shared (S) state. In the MESIF protocol, only a single instance of a cache line may be in the F state and that instance is the only one that may be duplicated [3]. Other caches may hold the data, but it will be in the shared state and cannot be copied. In other words, the cache line in the F state is used to respond to any read requests, while the S state cache lines are now silent. This makes the line in the F state a first amongst equals, when responding to snoop requests. By designating a single cache line to respond to requests, coherency traffic is substantially reduced when multiple copies of the data exist.
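If I'm reading that right, the F-state rule works out to something like this toy model (just an illustration of the rule as the article describes it, nothing like real coherency hardware):
Code:
/* Toy model of the MESIF read-snoop rule: of all the caches holding a
 * line, only the single M/E/F owner answers a read request, while
 * Shared (S) copies stay silent. Purely illustrative. */
#include <stdio.h>

enum mesif { INVALID, SHARED, EXCLUSIVE, MODIFIED, FORWARD };

/* Index of the cache that supplies the data for a read snoop,
 * or -1 if every copy is silent and memory must answer. */
static int responder(const enum mesif state[], int ncaches)
{
    for (int i = 0; i < ncaches; i++)
        if (state[i] == FORWARD || state[i] == MODIFIED ||
            state[i] == EXCLUSIVE)
            return i;
    return -1;
}

int main(void)
{
    /* Three caches hold the same line; only cache 2 is in F state,
     * so it alone responds even though caches 0 and 1 have the data. */
    enum mesif line[3] = { SHARED, SHARED, FORWARD };
    printf("cache %d answers the snoop\n", responder(line, 3));
    return 0;
}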
Accesses from CPU1 to memory1 are fast, as are accesses from CPU2 to memory2.
Accesses from CPU1 to memory2 are slower, as are accesses from CPU2 to memory1.
This is what is meant by "non-uniform".
If you have only one processor, all memory accesses go through the same memory controller and have the same speed, so you do not have NUMA in this case.
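To make the local/remote difference concrete, here is a rough sketch with libnuma. The node numbers 0 and 1 assume a two-socket box, and the actual timing is left to the reader, so treat it as an illustration only (build with -lnuma):
Code:
/* Pin the current thread to node 0, then place one buffer on the local
 * node and one on the remote node. Touching the remote buffer crosses
 * the interconnect (QPI on i7-era parts), so it is slower. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

#define SZ (64 * 1024 * 1024)

int main(void)
{
    if (numa_available() < 0 || numa_max_node() < 1) {
        printf("need a NUMA kernel and at least two nodes\n");
        return 1;
    }
    numa_run_on_node(0);                      /* run on node 0's CPUs */
    char *local  = numa_alloc_onnode(SZ, 0);  /* memory on our node   */
    char *remote = numa_alloc_onnode(SZ, 1);  /* memory on the other  */
    if (!local || !remote)
        return 1;
    memset(local, 0, SZ);   /* fast: same node as the running CPU */
    memset(remote, 0, SZ);  /* slower: goes over the interconnect */
    numa_free(local, SZ);
    numa_free(remote, SZ);
    return 0;
}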
On systems where the memory controller is integrated in each CPU, the number of NUMA nodes is the same as the number of processors.
The maximum number of NUMA nodes is a separate configuration option because in other systems, multiple CPUs can share a memory controller.
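As an aside, the config value is an exponent, not a node count. In the kernel source (include/linux/numa.h, lightly paraphrased) the limit comes out as a power of two, so the default of 6 allows up to 64 nodes:
Code:
/* The configured value is a power-of-two shift:
 * CONFIG_NODES_SHIFT=6 means up to 1 << 6 = 64 nodes. */
#ifdef CONFIG_NODES_SHIFT
#define NODES_SHIFT     CONFIG_NODES_SHIFT
#else
#define NODES_SHIFT     0
#endif

#define MAX_NUMNODES    (1 << NODES_SHIFT)
For a one-socket board you would normally leave CONFIG_NUMA off entirely, and then this option does not appear at all.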
Thank you for taking the time to post that. I've read about this multiple times and just kept missing the basic picture. Only after you put it that simply, and after I read the first line of the Wikipedia article, did it actually sink into my thick skull what it was really all about.
In the end, it doesn't seem the problem was FINDING the information, but the PERSON who found the information making use of it.