shared memory between user process and kernel

roydcruz · 01-18-2010, 11:13 PM

I would like to implement an inter-process communication mechanism
between processes. The process can be on different CPUs (say connected
by ethernet).

The standard method for IPC would be the use of UDP sockets. However,
the performance is lower that what we'd like because of the copy
operations in the kernel.

I would therefore like to implement shared memory between user
processes and the kernel. Also, I am not sure what synchronization
mechanism (to wake up the user process when a message is put in
shared memory) is available.

thanks in advance for your help.

cladisch · 01-19-2010, 03:14 AM

Quote:

I would therefore like to implement shared memory between user
processes

linux shared memory

Quote:

Also, I am not sure what synchronization
mechanism (to wake up the user process when a message is put in
shared memory) is available.

http://tldp.org/LDP/tlk/ipc/ipc.html

roydcruz · 01-19-2010, 12:44 PM

Isnt the shared memory described in the chapter shared memory
*between 2 user processes*. I'm looking for shared memory
*between user process and kernel*. thanks much.

sundialsvcs · 01-19-2010, 09:02 PM

You obviously cannot share "memory" by "ethernet."

Whatever you are trying now to do, may I politely suggest that "it's already been done." Instead of spending any more time pursuing your present strategy, do some research to discover the prior work that has already been perfected.

I say this most politely and graciously: "it is a waste of your time to chase something relentlessly, just because it is the first idea that popped into your head. And hey, we've all done it." (Show of hands, please? Thank you, everyone. ... See what I mean?

)

cladisch · 01-20-2010, 07:37 AM

Quote:

I'm looking for shared memory between user process and kernel.

This isn't called inter-process communication because the kernel isn't considered a process.

Sharing memory between a user process and kernel is usually implemented by mapping memory into both the kernel's and the process' address space.

What exactly are you trying to do?

roydcruz · 01-20-2010, 11:40 AM

I have 2 CPUs (with processes running on it) on 2 cards. These 2 cards are connected
via ethernet. I have process-1 running on CPU1 that needs to send a message to process-2
running on CPU2.

Ordinarily the way to do IPC between the 2 processes would be UDP sockets. But in this
approach the message is copied into a kernel buffer. This copying is affecting the performance
of the system.

I therefore want to implement an alternate mechanism where I share memory between user
space and kernel. Now I can use ring buffers to deliver the message between user process
and kernel.

The "mapping memory into kernel and user (process) address space" is exactly what I'm looking
to do. But I dont find this documented anywhere. A pointer to how this mapping can be done
would be greatly appreciated.

cladisch · 01-21-2010, 12:44 AM

Quote:

I have 2 CPUs (with processes running on it) on 2 cards. These 2 cards are connected via ethernet. […]
The "mapping memory into kernel and user (process) address space" is exactly what I'm looking to do.

Mapping memory into several address spaces can be done only on a shared memory bus.

Ethernet typically is not a memory bus. What architecture is this?

sundialsvcs · 01-21-2010, 11:04 AM

Thank you for better explaining what you are trying to do.

(If you considered my previous response to be insulting, then I apologize.)

I seriously believe that the amount of time required to "copy the data into a kernel buffer" cannot be a significant parcel of time vs. the amount of time required for (UDP or TCP) network transport.

There are numerous problems associated with sharing memory between the kernel and the user in this way, although it is sometimes done. The problems are:

Locked pages: All of the virtual memory pages required by the buffer area must be "locked" in memory so that the virtual-memory pager daemon cannot page them out.
Data consistency: if your application program can touch the data that is being accessed, say, by the I/O hardware, at the very instant that it is doing so, then you now have a bug that you'll never be able to troubleshoot because you'll never be able to reproduce it. You also have serious potential issues caused by touching data that is in the process of being received.
Memory access by hardware: In some systems, I/O hardware such as network interface cards cannot access "everywhere" in memory, and cannot assemble or transmit data from multiple dis-contiguous buffers, as it would need to do if headers are coming from one place while data is coming from another (i.e. the locked pages).

I therefore suggest that, "if the Kernel Implementors didn't do it 'that way,' then there must indeed have been a very good reason." I would not advocate pursuing this approach in any serious way.

roydcruz · 01-21-2010, 02:35 PM

> Ethernet typically is not a memory bus. What architecture is this?

I am not trying to share memory between to CPUs.
Just between kernel and userland. In my previous
post I was explaining that the kernel can get a
packet that it needs to deliver to a user process.
If I use shared memory between user land and kernel
I save a copy operation.

Also once I have shared memory, I will need a synchronization
mechanism between user process and kernel. The process has
to block until such time a message needs to be delivered
to it. Semapahore is an example of a resource that a
process blocks on. But I have only used semaphores between
2 process and not "process and kernel".

PS: Sundialsvcs - thanks for pointing out problems with my
approach. We are aware of the problems (I did not know
about the locking part). What I am doing is an experiment
to see if the performance improves. It is possible that
the experiment proves there is no improvement ... but it
also possible that there might be an improvement. thanks.

cladisch · 01-22-2010, 04:23 AM

Quote:

the kernel can get a packet that it needs to deliver to a user process. If I use shared memory between user land and kernel I save a copy operation.

It is unlikely that this copying has any measureable effect on the overall performance.

However, it is possible to do this. Implement mmap() for your device and map the pages of the DMA. Make sure to take care of caching and coherence issues.

Quote:

Also once I have shared memory, I will need a synchronization
mechanism between user process and kernel. The process has
to block until such time a message needs to be delivered
to it.

See chapter 6 of Linux Device Drivers.