Network stack behavior handling packets on multiple cores
I am working on a kernel module that sits between a NIC driver and the standard Linux network stack. The module does processing on a per-stream (TCP or UDP) basis. My question is:
Is it possible for packets on the same TCP or UDP stream (same destination IP, source IP, destination port, and source port) to be handled on different CPU cores? I'm trying to decide whether I need to handle/test this case.
If so, how does the kernel ensure that packets do not get out of order?
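For reference, the per-stream classification I have in mind looks roughly like this. It's a simplified sketch assuming IPv4 with the network and transport headers already set up; flow_key and extract_flow_key are made-up names for this post:
Code:
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/udp.h>
#include <linux/skbuff.h>

/* 4-tuple (plus protocol) identifying a TCP or UDP stream */
struct flow_key {
	__be32 saddr, daddr;
	__be16 sport, dport;
	__u8   proto;
};

static int extract_flow_key(const struct sk_buff *skb, struct flow_key *key)
{
	const struct iphdr *iph = ip_hdr(skb);

	key->saddr = iph->saddr;
	key->daddr = iph->daddr;
	key->proto = iph->protocol;

	if (iph->protocol == IPPROTO_TCP) {
		const struct tcphdr *th = tcp_hdr(skb);
		key->sport = th->source;
		key->dport = th->dest;
	} else if (iph->protocol == IPPROTO_UDP) {
		const struct udphdr *uh = udp_hdr(skb);
		key->sport = uh->source;
		key->dport = uh->dest;
	} else {
		return -1;	/* not a TCP/UDP stream */
	}
	return 0;
}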
Packets on the same stream may come from different cores on Linux. In the case of UDP you have no guarantee, since the protocol itself doesn't guarantee ordering. In the case of TCP the situation is more complicated, and using TCP that way doesn't make much sense. I don't have a multi-core machine at hand to verify, but you have at least a logical-order guarantee once sequence numbers have been assigned (and that happens close to the beginning of the path).
On a multi-core or multi-CPU system, any of them can potentially access the hardware at any time. Generally speaking, "network packet requests can come from anywhere at any time and in any order."
Having said that... also note that there are software structures (queues, sockets, and so on) which will have various kinds of serialization and mutual-exclusion mechanisms built into them. So, these mechanisms will work as-designed no matter how many cores, CPUs, processes, and threads might be involved.
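For instance, here is the general shape of such a structure; a minimal sketch (pkt_queue and the pq_* names are made up), built on the kernel's sk_buff_head, which embeds its own spinlock:
Code:
#include <linux/spinlock.h>
#include <linux/skbuff.h>

struct pkt_queue {
	struct sk_buff_head skbs;	/* sk_buff_head carries its own lock */
};

static void pq_init(struct pkt_queue *q)
{
	skb_queue_head_init(&q->skbs);
}

/* skb_queue_tail() and skb_dequeue() take the queue's internal lock,
 * so any core may enqueue or dequeue safely at any time. */
static void pq_enqueue(struct pkt_queue *q, struct sk_buff *skb)
{
	skb_queue_tail(&q->skbs, skb);
}

static struct sk_buff *pq_dequeue(struct pkt_queue *q)
{
	return skb_dequeue(&q->skbs);
}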
As long as you're using an SMP-aware kernel (a single-processor kernel will simply leave all but one of the CPUs it finds unused), Linux will happily and correctly take advantage of whatever hardware capabilities you have. The presence of multiple CPUs will make the actual behavior much less deterministic, but the software is built to have no dependency upon "determinism."
Thank you for the replies. I went back and looked at the kernel stack code. It looks like HARD_TX_LOCK(dev) is used to synchronize access to a network device's transmit rings. Is this correct?
Since my module essentially overrides the NIC driver's hard_start_xmit(), it looks like I am fine from an ordering standpoint. However, my module does some queueing and sends packets at a later time, so I will probably need to take HARD_TX_LOCK(dev) from within it to make sure I behave and serialize access to the hardware...
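Something along these lines is what I'm picturing for the deferred-transmit path; a sketch only, assuming a 2.6-era kernel. HARD_TX_LOCK() itself lives in net/core/dev.c, so from a module I would use the exported netif_tx_lock_bh()/netif_tx_unlock_bh() pair, which is what that macro boils down to for non-LLTX drivers. my_deferred_xmit and saved_start_xmit are my module's own names:
Code:
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* pointer to the driver's original transmit routine, saved when the
 * module hooks the device (module-specific detail) */
static int (*saved_start_xmit)(struct sk_buff *skb, struct net_device *dev);

static void my_deferred_xmit(struct sk_buff *skb, struct net_device *dev)
{
	/* serialize with every other caller of the driver's transmit
	 * routine, just as the core stack does for non-LLTX drivers */
	netif_tx_lock_bh(dev);
	if (!netif_queue_stopped(dev))
		saved_start_xmit(skb, dev);
	else
		kfree_skb(skb);	/* ring full; real code would requeue */
	netif_tx_unlock_bh(dev);
}
Alternatively, feeding the queued packet back through dev_queue_xmit() would let the core stack take the lock for me.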
Access to the physical device is synchronized, yes.
From your design, it looks like you should check whether all entries into your module are in synchronized sections. If not, you may need to synchronize on your own, especially if you have your own queues.
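If you do keep your own queue, a minimal sketch of guarding it with an explicit lock looks like this (all names are hypothetical, and the queue must have been set up with skb_queue_head_init()):
Code:
#include <linux/spinlock.h>
#include <linux/skbuff.h>

static DEFINE_SPINLOCK(my_pending_lock);
static struct sk_buff_head my_pending;	/* init with skb_queue_head_init() */

static void my_enqueue_pending(struct sk_buff *skb)
{
	spin_lock_bh(&my_pending_lock);		/* _bh: queue touched from softirq too */
	__skb_queue_tail(&my_pending, skb);	/* unlocked variant; our lock covers it */
	spin_unlock_bh(&my_pending_lock);
}

static struct sk_buff *my_dequeue_pending(void)
{
	struct sk_buff *skb;

	spin_lock_bh(&my_pending_lock);
	skb = __skb_dequeue(&my_pending);
	spin_unlock_bh(&my_pending_lock);
	return skb;
}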
Ever since NAPI came to the Linux kernel (2.6, and 2.4.19 in the 2.4 series), receive has been more scalable on SMP systems, and the order in which packets arrive is preserved by the upper-layer software as well. Packets are left in the device's receive ring and processed when the driver's poll() function is called; poll() then calls netif_receive_skb() to send each packet up the stack.
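The receive path described above looks roughly like this in a NAPI driver; a sketch only, where my_ring_next_skb() and my_enable_rx_irq() stand in for hypothetical driver-specific ring helpers (assumed to set skb->dev as real drivers do):
Code:
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>

/* hypothetical driver helpers, assumed to exist elsewhere: */
struct sk_buff *my_ring_next_skb(struct napi_struct *napi);
void my_enable_rx_irq(struct napi_struct *napi);

static int my_poll(struct napi_struct *napi, int budget)
{
	struct sk_buff *skb;
	int work = 0;

	/* pull completed packets off the RX ring, up to the budget */
	while (work < budget && (skb = my_ring_next_skb(napi)) != NULL) {
		skb->protocol = eth_type_trans(skb, skb->dev);
		netif_receive_skb(skb);	/* hand the packet up the stack */
		work++;
	}

	if (work < budget) {
		napi_complete(napi);	/* ring drained */
		my_enable_rx_irq(napi);	/* re-enable the device interrupt */
	}
	return work;
}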
I don't think Tx has changed a whole lot, given that xmit_lock has been around for a long time.