You are quite possibly exceeding the ability of the hardware to move packets in and out. Too many CPUs are simultaneously trying to queue and handle work ... more than the physical hardware can actually do. They are "Ferraris stuck in traffic on a two-lane highway."
A spin-lock should actually "spin" only briefly and only occasionally. If you find a significant amount of time being spent "spinning," realize that this time is "100% wasted."
You only need to dedicate enough CPUs to this task to handle "1 gigabit per second." You accomplish nothing of value by dedicating more CPUs to the task if they merely wait for one another, especially since they are literally wasting time in a spin-lock when they should be finding something else to do. (Other important work might not be getting done, because everyone's spinning their wheels.)
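To put a rough number on "enough CPUs for 1 gigabit per second," here is a back-of-the-envelope sketch. The Ethernet framing overhead (8 bytes of preamble plus a 12-byte inter-frame gap) is standard, but the per-packet CPU cost passed to `cpus_needed` is purely a hypothetical figure; measure your own before trusting the result.

```python
import math

# Bytes on the wire that are not part of the frame itself:
# 8 bytes preamble/start delimiter + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20

def packets_per_second(link_bps, frame_bytes):
    """Maximum frames per second a link can carry for a given frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)

def cpus_needed(pps, ns_per_packet):
    """CPUs required if each packet costs ns_per_packet of CPU time.

    ns_per_packet is an assumed, workload-specific number.
    """
    return math.ceil(pps * ns_per_packet / 1e9)

if __name__ == "__main__":
    for frame in (1518, 64):  # full-size and minimum-size Ethernet frames
        pps = packets_per_second(1e9, frame)
        # 1000 ns per packet is a made-up cost, for illustration only.
        print(frame, round(pps), cpus_needed(pps, 1000))
```

With full-size frames a gigabit link tops out at roughly 81 thousand packets per second, so at the assumed cost a single CPU has plenty of headroom. Anything beyond that is just more cars on the same two-lane highway.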
You might well find that only one CPU is actually needed to marshal the I/O requests. (Or, one CPU per physical NIC.) The others don't need to be concerned with it at all. You can use "affinity" rules to distribute computing resources among competing priorities.
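As a minimal sketch of what an "affinity" rule looks like from user space, the following pins the calling process to a single CPU using Python's `os.sched_setaffinity`, which is available on Linux only. The helper names are hypothetical, and which CPU you actually pick for the I/O-marshaling task is a decision for your own system.

```python
import os

def pin_current_process(cpus):
    """Restrict the calling process to the given CPU ids; return the new mask."""
    os.sched_setaffinity(0, set(cpus))  # pid 0 means "this process"
    return os.sched_getaffinity(0)

def pin_to_one_allowed_cpu():
    """Pin this process to the lowest-numbered CPU it is allowed to use."""
    allowed = os.sched_getaffinity(0)   # may already be limited (cgroups, taskset)
    chosen = min(allowed)
    return pin_current_process({chosen})

if __name__ == "__main__":
    print("before:", sorted(os.sched_getaffinity(0)))
    print("after: ", sorted(pin_to_one_allowed_cpu()))
```

This only covers process affinity. Steering the NIC's interrupts to a particular CPU is a separate setting (on Linux, the per-IRQ `smp_affinity` files under `/proc/irq/`), and it normally requires root.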
Last edited by sundialsvcs; 10-11-2017 at 09:48 AM.