Originally Posted by jthjth
What methods exist for communicating between processes? Signals I understand, but these are a bit limiting. I've seen people use tcp/ip sockets, but this seems less than ideal, especially when all the processes are on one machine.
I've seen signals, pipes, named pipes (also known as fifos), UNIX domain sockets (stream or datagram sockets), UDP/IP sockets, TCP/IP sockets, shared memory, memory-mapped files, and plain old files used well/efficiently. There are also higher-level services provided by e.g. desktop environments, but they typically use one of the aforementioned ones.
I believe OpenMPI (an MPI implementation) uses shared memory locally, making it surprisingly efficient even locally. (While you can do better by using shared memory explicitly, you need to be aware of cache effects and especially cacheline ping-pong; most programmers don't care, and OpenMPI ends up being faster/better..)
Originally Posted by jthjth
To extend this a little further, say I have a bunch of computers on an Ethernet network, what methods exist to communicate between processes on different machines?
There are a lot of options. Here are the ones that I'd consider:
- MPI (Message-Passing Interface)
This is very similar to sockets or pipes from the applications point of view, but the actual communication method used can vary from shared memory to TCP/IP. Most importantly, the application always does the same thing, and the interface takes care of the details. Widely used in HPC (High-Performance computing), both over IP and InfiniBand networks.
- TCP or UDP/IP sockets
Raw UDP sockets have very little overhead, and especially with multicast (same data packet directed to a number of recipients) may be a lot more efficient than the alternatives. If the data flow is mostly one-way, and especially if packet loss is preferred to retransmission, UDP/IP may be better than TCP/IP or MPI.
- Plain old files over NFS
When the tasks are separated so that no communication is needed between workers, it may be simpler (and more robust) to use plain old files to supply source data to each worker (as a separate job file), and have the (remote) workers save the results into a new file.
Note that advisory file locking only works reliably if the NFS lock manager is correctly configured. There are a lot of installations where file locking over NFS does not work correctly. This means you'll probably want to use a temporary directory on the same NFS mount to save the files to, then fsync() (to make sure the file contents are "on-disk"), before hard-linking (renaming) the files to the final location. Then you only need to monitor the files in the final location without any other locking. A similar mechanism can be used for "accepting" jobs.
Remote Procedure Calls. Uses TCP or UDP/IP. Bindings and variants exist for a lot of languages.
Berkeley Open Infrastructure for Network Computing is the one used for @home projects.
You can set up your own project, either public or local, and distribute the work to BOINC clients. I believe it uses a custom protocol over TCP/IP.
Again, the lists above are not exhaustive. They are only the ones I've seen used well, and would consider for myself.
My own background is in Physics (Computational Materials Physics, to be precise), and I work with high-performance computing. In my case, Linux clusters running distributed physics and/or chemistry simulations. In this arena, MPI seems to be the best choice for interprocess communication. (Funnily enough, because of algorithmic choices and unawareness of cache-related effects, for most programmers MPI yields a faster parallel simulator than when using threads/shared memory, when running on one computer. This means the overheads in MPI implementations, OpenMPI for example, are very low. Certainly, if you want the same program to be distributable locally (multiple processes on the same machine), and over the network, MPI is the best choice I know of.)
If you are using Linux, I would suggest looking first at MPI for parallel and distributed processing. If you work with e.g. IP-connected microcontrollers, then you might consider using UDP sockets instead. There are some TCP stacks too for microcontrollers, but unless you intend to use a protocol that is implemented on top of TCP, I'd go for UDP instead, and design the communications so that missing packets will be detected and retransmitted if/when necessary. For example, a sensor package transmitting its status every couple of seconds probably should not bother retransmitting its old status. If you are working on a multiplatform client-server application, RPC (or similar, perhaps libevent or SOAP) might suit best.
Hope this helps,