I recommend relying on the kernel for I/O caching, but do use posix_fadvise() to tell it about your file access patterns. Most importantly, if you know you won't need a part of a file again after reading it, use posix_fadvise() with POSIX_FADV_DONTNEED so the kernel can drop it from the page cache.
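A minimal sketch of that pattern, assuming a plain pread() loop (the chunk size and the send step are placeholders, error handling omitted):

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (1 << 20)            /* 1 MiB; tune for your workload */

void serve_file(int fd, off_t filesize)
{
    char *buf = malloc(CHUNK);

    /* We'll read front to back: let the kernel ramp up readahead. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    for (off_t off = 0; off < filesize; off += CHUNK) {
        ssize_t n = pread(fd, buf, CHUNK, off);
        if (n <= 0)
            break;

        /* ... send buf[0..n) to the client here ... */

        /* We won't touch this range again: drop it from the page
         * cache so it doesn't push out more useful data. */
        posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
    }
    free(buf);
}
```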
If you have multiple concurrent connections, I recommend using asynchronous I/O (aio_read()) for the disk side and nonblocking I/O for the socket side. You'll basically always have the next file block posted as an asynchronous read, with a signal delivered whenever a read completes and the block is available; for best performance you'll want space for about three blocks per connection (one being sent, one ready to send, one being read from disk). A single thread (for example the main process thread itself) can handle the socket communication for all connections.
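Here's a sketch of the disk half of that arrangement, assuming one outstanding aio_read() per connection and SIGUSR1 as the completion signal; the conn struct and block size are illustrative, and on glibc you link with -lrt:

```c
#include <aio.h>
#include <signal.h>
#include <string.h>

#define BLOCK 65536

struct conn {                      /* illustrative per-connection state */
    int file_fd;
    off_t next_off;
    char buf[BLOCK];
    struct aiocb cb;
    volatile sig_atomic_t ready;   /* set by the signal handler */
};

static void on_block_done(int sig, siginfo_t *si, void *uctx)
{
    /* sival_ptr carries our connection back to us; just flag it -
     * the main loop does the real work (handlers must stay minimal). */
    struct conn *c = si->si_value.sival_ptr;
    c->ready = 1;
    (void)sig; (void)uctx;
}

static void post_next_read(struct conn *c)
{
    memset(&c->cb, 0, sizeof(c->cb));
    c->cb.aio_fildes = c->file_fd;
    c->cb.aio_buf    = c->buf;
    c->cb.aio_nbytes = BLOCK;
    c->cb.aio_offset = c->next_off;

    /* Deliver SIGUSR1 when the read completes, tagged with the conn. */
    c->cb.aio_sigevent.sigev_notify          = SIGEV_SIGNAL;
    c->cb.aio_sigevent.sigev_signo           = SIGUSR1;
    c->cb.aio_sigevent.sigev_value.sival_ptr = c;

    aio_read(&c->cb);
    c->next_off += BLOCK;
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_block_done;
    sa.sa_flags     = SA_SIGINFO;
    sigaction(SIGUSR1, &sa, NULL);

    /* ... open files and sockets, call post_next_read() per connection;
     * in the main loop, when c->ready is set, check aio_error(&c->cb)
     * and aio_return(&c->cb), queue the block on the (nonblocking)
     * socket, and post the next read. */
    return 0;
}
```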
This way the kernel is free to decide the order of disk accesses and can make full use of the I/O elevator. posix_fadvise() tells it which parts of each file should be read ahead (POSIX_FADV_WILLNEED) and which can be dropped from the page cache (POSIX_FADV_DONTNEED). Because your program doesn't burn precious RAM on internal caching, the OS has more RAM for the page cache, giving it leeway in its caching decisions.
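Concretely, that advice can be issued as a small sliding window per connection; an illustrative sketch, assuming block-aligned offsets:

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>

#define BLOCK 65536

/* Called each time a connection finishes sending the block at "off":
 * kick off readahead for the block after next, and evict the one
 * we just sent. Offsets are assumed block-aligned. */
void advance_window(int fd, off_t off)
{
    posix_fadvise(fd, off + 2 * BLOCK, BLOCK, POSIX_FADV_WILLNEED);
    posix_fadvise(fd, off, BLOCK, POSIX_FADV_DONTNEED);
}
```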
On the socket side, you may need to use a memory-mapped packet ring (Documentation/networking/packet_mmap.txt in the kernel source; see the wiki) to achieve gigabit rates with small packets, since per-packet system call overhead becomes the limiting factor there. If you map the circular send (TX) ring into userspace, you can construct multiple packets to multiple destinations in the buffer, and hand them all to the kernel at once with a single send() call.
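A condensed sketch of such a sender, following the TPACKET_V1 setup described in packet_mmap.txt; the interface name, frame counts, and the zeroed placeholder frames are assumptions (real code must build complete link-layer frames, since PF_PACKET bypasses the IP stack), and the socket requires CAP_NET_RAW:

```c
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

#define FRAME_SIZE 2048
#define FRAME_NR   512

int main(void)
{
    int fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

    /* Request the kernel-side TX ring, then map it into our space. */
    struct tpacket_req req = {
        .tp_block_size = FRAME_SIZE * FRAME_NR,   /* one big block */
        .tp_block_nr   = 1,
        .tp_frame_size = FRAME_SIZE,
        .tp_frame_nr   = FRAME_NR,
    };
    setsockopt(fd, SOL_PACKET, PACKET_TX_RING, &req, sizeof(req));

    char *ring = mmap(NULL, req.tp_block_size * req.tp_block_nr,
                      PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Bind the ring to one interface ("eth0" is an assumption). */
    struct sockaddr_ll addr = {
        .sll_family   = AF_PACKET,
        .sll_protocol = htons(ETH_P_ALL),
        .sll_ifindex  = if_nametoindex("eth0"),
    };
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    /* Fill several frames - each could go to a different destination -
     * marking each one ready for the kernel to transmit. */
    for (unsigned i = 0; i < 16; i++) {
        struct tpacket_hdr *hdr = (void *)(ring + i * FRAME_SIZE);
        char *data = (char *)hdr + TPACKET_HDRLEN
                                 - sizeof(struct sockaddr_ll);
        size_t len = 60;           /* build a real frame here */
        memset(data, 0, len);      /* placeholder contents */
        hdr->tp_len    = len;
        hdr->tp_status = TP_STATUS_SEND_REQUEST;
    }

    /* One system call pushes every queued frame out. */
    send(fd, NULL, 0, 0);

    munmap(ring, req.tp_block_size * req.tp_block_nr);
    close(fd);
    return 0;
}
```

The point of the ring is that single send(fd, NULL, 0, 0): its cost is amortized over every frame marked TP_STATUS_SEND_REQUEST, instead of paying one system call per packet.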