LinuxQuestions.org
08-10-2011, 05:08 PM   #1
Mapisto
LQ Newbie
 
Registered: Aug 2011
Posts: 2

What's the best way of disk I/O in cyclic manner?


Hi all,

These days I'm writing a small testing program that takes network captures (pcaps), finds the TCP connections, and overwrites the IP layer.

From my experience, the bottleneck will be disk I/O. I need to read multiple files in a cyclic manner and transmit their data to the network. I'm considering either a pre-allocated dynamic cache using cyclic buffers and multi-threading, or the simpler alternative: relying on the OS's own I/O caching mechanism.

Which one is better? Any other suggestions will be appreciated...

Thanks,
Guy
 
08-12-2011, 08:09 AM   #2
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

The conventional wisdom in matters of efficiency & optimization is to not try to outsmart the OS. Perhaps a better approach would be to optimize by using the most appropriate filesystem type, which can make significant differences in some circumstances.

--- rod.
 
08-12-2011, 08:11 AM   #3
Proud
Senior Member
 
Registered: Dec 2002
Location: England
Distribution: Used to use Mandrake/Mandriva
Posts: 2,794

Sounds a bit like premature optimisation. Can't you write a prototype and benchmark it first, to see whether you have any issue at all, before you drastically change your code or bypass the default OS mechanisms and tunables?
 
08-12-2011, 08:30 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,130

Unless you are doing direct I/O, you'll be using the page cache anyway. And how do you propose to change the I/O to avoid this feared bottleneck?
Why are you re-reading the (disk) data? If it doesn't change, it'll (probably) still be in the cache, depending on RAM of course. And if it is being updated, it'll probably be in the cache as well.
So, all up I wouldn't bother.

Now if it was a system with (very) limited RAM, say an embedded system, that might be different.
 
08-12-2011, 05:54 PM   #5
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

I recommend relying on the kernel for I/O caching. You should use posix_fadvise() to tell the kernel about file access patterns, though. Most importantly, if you know you won't need a part of a file you just read, use posix_fadvise() to tell the kernel to drop it from the cache.
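
A minimal sketch of the posix_fadvise() usage described above, assuming the capture files are read front to back in blocks. The helper names open_sequential() and read_and_drop() are hypothetical and not from this thread. The advice calls are only hints, so their return values are ignored here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a file read-only and hint that it will be
 * read sequentially, so the kernel may read ahead more aggressively. */
int open_sequential(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    /* offset 0, len 0 means "the whole file". A failed hint is not fatal. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    return fd;
}

/* Hypothetical helper: read one block at `offset`, then tell the kernel
 * that this range is not needed again and may leave the page cache. */
ssize_t read_and_drop(int fd, void *buf, size_t len, off_t offset)
{
    ssize_t n = pread(fd, buf, len, offset);
    if (n > 0)
        (void)posix_fadvise(fd, offset, (off_t)n, POSIX_FADV_DONTNEED);
    return n;
}
```

If the same files are replayed in a loop and fit in RAM, skipping the POSIX_FADV_DONTNEED call and letting the page cache keep them is probably the better choice; that is worth checking with a benchmark.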

If you have multiple concurrent connections, I recommend using asynchronous I/O for disk access (aio_read()), and nonblocking I/O for the socket side. You'll basically always have the next file block posted as an async read/write (you'll need space for three blocks per connection for best performance), with a signal delivered whenever a block is read and available. A single thread (for example the main process thread itself) can do the socket communication for all connections.

This way the kernel is free to decide the order of disk accesses, and can use the I/O elevator fully. Using posix_fadvise() will tell the kernel which parts of the file should be read (ahead), and which parts can be dropped from the page cache. Your program won't use up precious RAM for internal caching, so the OS has more RAM for page cache, giving it leeway for caching decisions.

On the socket side, you may need to use a memory map (Documentation/networking/packet_mmap.txt, see Wiki) to achieve gigabit rates with small packets (since with small packets the system call overhead becomes a limiting factor). If you map the send buffer (TX, circular) to userspace, you can construct multiple packets to multiple destinations in the buffer, and send them all at once by calling send().
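
A partial sketch of the setup step only: creating a packet socket, requesting a TX ring with PACKET_TX_RING, and mapping it into userspace. The struct tx_ring type, the function names, and the ring dimensions (64 blocks of one page, 2048-byte frames) are assumptions for illustration. Binding to an interface, filling frames, and the final send() call are not shown, and opening the socket normally requires root or CAP_NET_RAW.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical handle for a memory-mapped transmit ring. */
struct tx_ring {
    int fd;                 /* packet socket, or -1 */
    void *map;              /* start of the ring, or MAP_FAILED */
    size_t map_len;         /* size of the mapping in bytes */
    struct tpacket_req req; /* ring geometry given to the kernel */
};

/* Create the socket, request the TX ring, and map it.
 * Returns 0 on success, -1 on error (for example, missing privileges). */
int tx_ring_open(struct tx_ring *ring)
{
    long page = sysconf(_SC_PAGESIZE);
    int saved;

    memset(ring, 0, sizeof *ring);
    ring->map = MAP_FAILED;
    ring->fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (ring->fd == -1)
        return -1;

    /* Assumed geometry: frames must divide evenly into blocks. */
    ring->req.tp_block_size = (unsigned int)page;
    ring->req.tp_frame_size = 2048;
    ring->req.tp_block_nr = 64;
    ring->req.tp_frame_nr = (unsigned int)(page / 2048) * 64;

    if (setsockopt(ring->fd, SOL_PACKET, PACKET_TX_RING,
                   &ring->req, sizeof ring->req) == -1)
        goto fail;

    ring->map_len = (size_t)ring->req.tp_block_size * ring->req.tp_block_nr;
    ring->map = mmap(NULL, ring->map_len, PROT_READ | PROT_WRITE,
                     MAP_SHARED, ring->fd, 0);
    if (ring->map == MAP_FAILED)
        goto fail;
    return 0;

fail:
    saved = errno;
    close(ring->fd);
    ring->fd = -1;
    errno = saved;
    return -1;
}

/* Unmap the ring and close the socket. */
void tx_ring_close(struct tx_ring *ring)
{
    if (ring->map != MAP_FAILED)
        munmap(ring->map, ring->map_len);
    if (ring->fd != -1)
        close(ring->fd);
    ring->map = MAP_FAILED;
    ring->fd = -1;
}
```

The frame layout inside the ring and the status flags used to hand frames to the kernel are described in the kernel's packet_mmap documentation mentioned above.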
 
08-13-2011, 01:35 AM   #6
Mapisto
LQ Newbie
 
Registered: Aug 2011
Posts: 2

Original Poster
Hi to all,
Proud: I have a large amount of memory (at least 8GB), so I hope that Linux will allocate big enough disk caches, and writing a multithreaded solution would take some time...
I'm trying to avoid unnecessary work and to use the system's built-in cache mechanisms (instead of writing my own). For instance, using mmap will spare me unnecessary buffer allocations (like those needed with fread).
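
A minimal sketch of the mmap approach mentioned here, assuming each capture file is mapped read-only in one piece. The struct mapped_file type and the function names are hypothetical. The mapped pages come straight from the page cache, so no separate read buffer is allocated.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical view of a whole file mapped read-only. */
struct mapped_file {
    const unsigned char *data;
    size_t len;
};

/* Map `path` read-only. Returns 0 on success, -1 on error or empty file. */
int map_file(const char *path, struct mapped_file *out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;
    /* Hint only: the file will be walked from start to end. */
    (void)posix_madvise(p, (size_t)st.st_size, POSIX_MADV_SEQUENTIAL);
    out->data = p;
    out->len = (size_t)st.st_size;
    return 0;
}

/* Release the mapping created by map_file(). */
void unmap_file(struct mapped_file *m)
{
    if (m->data != NULL)
        munmap((void *)m->data, m->len);
    m->data = NULL;
    m->len = 0;
}
```

Whether this beats plain read() calls for this workload is not something the thread establishes; it is one of the things a benchmark would show.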

Nominal Animal: posix_fadvise() sounds great. I don't think that asynchronous I/O will be good for me, since I'm doing really short calculations (I'll know much more after benchmarking). packet_mmap is a great solution; I hope I'll have the time to add this feature to the TcpReplay project.

Thanks for the help,
Guy.
 
  

