Fast inter-application communication for academic simulator

sarmadys · 12-01-2010, 05:08 AM

Hello,

I have academic simulation software and I want to visualize its output while the simulation is running. However, I want to keep the visualization and simulation modules separate. The simulation data will be held in an array of a size of around 0.5M and will be read-only for the visualization software (but updated regularly by the simulator).

- In the past I have used shared memory to share small variables between two applications. I wanted to know whether members here can suggest better ways to do this.

- TCP/IP adds the option of having the simulator and visualization applications on separate machines, but the implementation will be more difficult.

- I have also thought about an abstraction layer that would allow the communication/interconnection layer to be replaced with other methods later (file/network/shared memory/pipe).

I would appreciate it if experienced members could enlighten me with their opinions.

Thank you,
Mac

jiml8 · 12-02-2010, 01:48 PM

You say you have the software. Seems to me your choices are to use the mechanisms the software supports.

Sergei Steshenko · 12-02-2010, 02:50 PM

Quote:

Originally Posted by sarmadys

Hello,

I have academic simulation software and I want to visualize its output while the simulation is running. However, I want to keep the visualization and simulation modules separate. The simulation data will be held in an array of a size of around 0.5M and will be read-only for the visualization software (but updated regularly by the simulator).

- In the past I have used shared memory to share small variables between two applications. I wanted to know whether members here can suggest better ways to do this.

- TCP/IP adds the option of having the simulator and visualization applications on separate machines, but the implementation will be more difficult.

- I have also thought about an abstraction layer that would allow the communication/interconnection layer to be replaced with other methods later (file/network/shared memory/pipe).

I would appreciate it if experienced members could enlighten me with their opinions.

Thank you,
Mac

So, look up inter-process communication if you want your visualization part to be a separate process.

I am not sure to what extent the word "fast" in the thread name is appropriate here. It is, first of all, ambiguous, because it can refer to either latency or throughput.

sarmadys · 12-02-2010, 07:52 PM

Quote:

Originally Posted by Sergei Steshenko

So, look up inter-process communication if you want your visualization part to be a separate process.

I am not sure to what extent the word "fast" in the thread name is appropriate here. It is, first of all, ambiguous, because it can refer to either latency or throughput.

Sergei,

Thanks for your comments. You are right, by fast I meant both acceptable latency and (0.5M*24)/s throughput. Most of the IPC methods do not fit my requirements. Named pipes, web services, sockets, etc. are not a good fit.

The problem here is that we currently use MinGW on Windows, and it does not support shared memory there. I guess that, among the IPC methods, memory-mapped files are the only candidate I can use on Windows. That needs native Windows API calls, which I do not really like.
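To make the mapped-file idea concrete, here is a minimal sketch using the POSIX calls (open, ftruncate, mmap). It is not the Windows code: on Windows the equivalent native calls would be CreateFileMapping and MapViewOfFile. The function names, the double element type, and the file path are assumptions made only for illustration, and the sketch does no synchronization, so a reader could see a half-updated array.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: maps `bytes` of the file at `path` as shared memory.
// The writer creates and sizes the file; the reader expects it to exist
// already with at least `bytes` in it. Returns nullptr on failure.
static void* map_file(const char* path, std::size_t bytes, bool writer) {
    const int fd = open(path, writer ? (O_RDWR | O_CREAT) : O_RDONLY, 0600);
    if (fd < 0) return nullptr;
    if (writer && ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
        close(fd);
        return nullptr;
    }
    const int prot = writer ? (PROT_READ | PROT_WRITE) : PROT_READ;
    void* p = mmap(nullptr, bytes, prot, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Simulator side: read/write view of `count` doubles.
double* open_writer_view(const char* path, std::size_t count) {
    return static_cast<double*>(map_file(path, count * sizeof(double), true));
}

// Visualizer side: read-only view of the same `count` doubles.
const double* open_reader_view(const char* path, std::size_t count) {
    return static_cast<const double*>(
        map_file(path, count * sizeof(double), false));
}

// Releases a view obtained from either function above.
void close_view(const void* view, std::size_t count) {
    munmap(const_cast<void*>(view), count * sizeof(double));
}
```

In real use the simulator would call open_writer_view once at startup and the visualizer would call open_reader_view from its own process. Both views refer to the same pages, so nothing is copied between the two.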

Anyway, it seems I need an abstraction layer that hides the platform-specific implementation of the communication.
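One way such an abstraction layer could look is sketched below. Everything in it (SnapshotChannel, LocalChannel, make_channel, and the publish/fetch split) is a hypothetical name chosen for illustration, not part of any existing library. The only backend shown is a plain in-process buffer; a mapped-file, shared-memory, or socket backend would implement the same two methods.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical interface: the simulator publishes a snapshot, the
// visualizer fetches the most recent one.
class SnapshotChannel {
public:
    virtual ~SnapshotChannel() = default;
    // Called by the simulator after each update step.
    virtual void publish(const void* data, std::size_t bytes) = 0;
    // Called by the visualizer. Copies the latest snapshot into `out` and
    // returns the number of bytes copied (0 if nothing was published yet).
    virtual std::size_t fetch(void* out, std::size_t capacity) = 0;
};

// Simplest possible backend: a buffer inside one process.
class LocalChannel : public SnapshotChannel {
public:
    void publish(const void* data, std::size_t bytes) override {
        const auto* p = static_cast<const unsigned char*>(data);
        latest_.assign(p, p + bytes);
    }
    std::size_t fetch(void* out, std::size_t capacity) override {
        const std::size_t n =
            latest_.size() < capacity ? latest_.size() : capacity;
        if (n > 0) std::memcpy(out, latest_.data(), n);
        return n;
    }
private:
    std::vector<unsigned char> latest_;
};

// Factory: the only place that knows which backend is in use, so swapping
// the transport later means changing this one function.
std::unique_ptr<SnapshotChannel> make_channel() {
    return std::make_unique<LocalChannel>();
}
```

The simulator and visualizer would only ever see SnapshotChannel, so the platform-specific code stays behind make_channel.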

sarmadys · 12-02-2010, 07:54 PM

Quote:

Originally Posted by jiml8

You say you have the software. Seems to me your choices are to use the mechanisms the software supports.

I have developed the software myself. It was previously developed using Java but now it is being converted to C++. I am free to do whatever I want.

jiml8 · 12-02-2010, 11:00 PM

Quote:

Originally Posted by sarmadys

I have developed the software myself. It was previously developed using Java but now it is being converted to C++. I am free to do whatever I want.

OK.

I actually have a system in place that does exactly what you are trying to do, though I'm only dealing with 32K buffers. My system is multi-threaded and multi-process.

One process (written in C) has threads that collect data, process data, and report data. The process data and report data threads, together, do what you want to do. Process data works with N buffers, where N is a configuration item and I normally have it set to about 52 or 53. This means that process data works with the current data and the 51 or 52 previous sets of data (I'm collecting real-time statistics along the way and using it for my maximum likelihood estimator routine).

Report data has the job of shipping the contents of buffers once the processing on them is complete (and never before it is complete). Report data sends the buffers to another process (written in C++) that visualizes the data (which is what you want to do). The visualize process may be located someplace else across the internet, or it may be on the same machine...doesn't matter.
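The "never before it is complete" rule can be sketched as a ring of buffers with a per-slot completion flag. This is a simplified illustration, not the actual system described above: the names (Slot, BufferRing and its methods) are hypothetical, it assumes exactly one producing thread and one reporting thread, and it does not keep the previous buffers around for statistics.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// One buffer plus a flag saying whether processing on it has finished.
struct Slot {
    std::vector<float> data;
    std::atomic<bool> complete{false};
};

// Ring of slots for one producer (process data) and one consumer (report data).
class BufferRing {
public:
    BufferRing(std::size_t slots, std::size_t samples) : slots_(slots) {
        for (auto& s : slots_) s.data.resize(samples);
    }
    // Producer: the slot to fill next, or nullptr if the ring is full
    // (the reporter has not released that slot yet).
    Slot* current() {
        Slot& s = slots_[write_];
        return s.complete.load(std::memory_order_acquire) ? nullptr : &s;
    }
    // Producer: mark the slot returned by current() as fully processed.
    // Must only be called after current() returned a non-null slot.
    void finish_current() {
        slots_[write_].complete.store(true, std::memory_order_release);
        write_ = (write_ + 1) % slots_.size();
    }
    // Consumer: oldest completed slot not yet shipped, or nullptr if the
    // next slot in order is still being processed.
    const Slot* next_to_report() const {
        const Slot& s = slots_[read_];
        return s.complete.load(std::memory_order_acquire) ? &s : nullptr;
    }
    // Consumer: hand the slot back to the producer after shipping it.
    void release_reported() {
        Slot& s = slots_[read_];
        if (!s.complete.load(std::memory_order_acquire)) return;  // nothing to release
        s.complete.store(false, std::memory_order_release);
        read_ = (read_ + 1) % slots_.size();
    }
private:
    std::vector<Slot> slots_;
    std::size_t write_ = 0;  // touched only by the producer
    std::size_t read_ = 0;   // touched only by the consumer
};
```

The reporting side polls next_to_report(), ships the slot it gets, and then calls release_reported(). A slot that is still being filled is never visible to it.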

The process data and report data threads, being threads, both access the buffer data from within the same process. The report data thread uses UDP sockets to ship the data to the visualization process.
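Shipping a buffer over UDP could look roughly like the sketch below, written against the POSIX socket API. It assumes that "32K" means 32 KiB, so one buffer fits in a single datagram (the IPv4 UDP payload limit is 65,507 bytes), and it sends to the loopback address only. The function names are hypothetical, and since UDP gives no delivery guarantee, a real system would need to tolerate lost buffers.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Visualizer side: UDP socket bound to 127.0.0.1 on an OS-chosen port.
// Stores the port (host byte order) in *port. Returns the fd, or -1.
int open_receiver(std::uint16_t* port) {
    const int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the OS pick a free port
    socklen_t len = sizeof addr;
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) {
        close(fd);
        return -1;
    }
    timeval tv{};
    tv.tv_sec = 2;  // give up on recv() after 2 seconds instead of blocking forever
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    *port = ntohs(addr.sin_port);
    return fd;
}

// Reporter side: sends one buffer as a single datagram to 127.0.0.1:port.
bool send_buffer(std::uint16_t port, const std::vector<std::uint8_t>& buf) {
    const int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return false;
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dst.sin_port = htons(port);
    const ssize_t n = sendto(fd, buf.data(), buf.size(), 0,
                             reinterpret_cast<const sockaddr*>(&dst), sizeof dst);
    close(fd);
    return n == static_cast<ssize_t>(buf.size());
}

// Visualizer side: waits for one datagram; returns an empty vector on
// timeout or error.
std::vector<std::uint8_t> receive_buffer(int fd, std::size_t max_bytes) {
    std::vector<std::uint8_t> buf(max_bytes);
    const ssize_t n = recv(fd, buf.data(), buf.size(), 0);
    buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    return buf;
}
```

A buffer larger than one datagram (such as the whole 0.5M array) would have to be split into numbered chunks and reassembled on the receiving side, which this sketch does not attempt.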

My development and testing setup has the process data and report data threads running on an old 1 GHz Celeron processor with 768 Megs of RAM. The visualization routine runs on a much bigger and faster machine elsewhere on the LAN, and the LAN is a 100 Mb/sec environment.

With this setup, and my 32K buffers, I can update my visualization display about 40 times/sec, using no data compression when I send the data across the network. The Celeron processor is at about 15% utilization during this process, and the network is running at nearly full speed.

This system is doing real-time data collection and I honestly can't tell you how fast it COULD run transferring data because the whole thing winds up waiting on the front end, which is an L-band tuner followed by an A/D and a DSP that does an FFT on the incoming signal. It takes between 10 and 15 milliseconds for each sample to come in, and that sets a hard upper limit on how fast my system will run. That is why I'm happy to use an old Celeron to do the processing job.

With faster multicore hardware and (if needed) a faster LAN, I'll bet you could meet your throughput requirements easily enough.

Sergei Steshenko · 12-03-2010, 08:40 AM

Quote:

Originally Posted by sarmadys

Sergei,

Thanks for your comments. You are right, by fast I meant both acceptable latency and (0.5M*24)/s throughput. Most of the IPC methods do not fit my requirements. Named pipes, web services, sockets, etc. are not a good fit.

The problem here is that we currently use MinGW on Windows, and it does not support shared memory there. I guess that, among the IPC methods, memory-mapped files are the only candidate I can use on Windows. That needs native Windows API calls, which I do not really like.

Anyway, it seems I need an abstraction layer that hides the platform-specific implementation of the communication.

I think Cygwin supports shared memory, so if you can switch, do.
...
Don't know how relevant these are:

http://freshmeat.net/projects/mpe
http://freshmeat.net/projects/pvm
http://freshmeat.net/projects/channel
http://www.zeromq.org/ - this one appears to be a very fast one.