LinuxQuestions.org
Old 01-16-2018, 12:48 AM   #1
sreyan32
LQ Newbie
 
Registered: Jan 2015
Posts: 15

Rep: Reputation: Disabled
How does the writes-starving-reads problem work in *nix OSes?


I am trying to understand I/O schedulers in Linux, and I keep encountering the term "writes-starving-reads problem".

So far, what I understand is this: if there is a steady stream of write requests for one region of the disk and a read request for a sector farther along, the read may never be serviced because new write requests keep getting queued ahead of it.

But I have a couple of questions:
  1. The same thing can happen to write requests: if read requests keep coming in, the writes will eventually be starved. So what's so special about writes starving reads?
  2. What does it mean that write requests can "stream"? Quoting Robert Love from his book Linux System Programming:
    Code:
    This is in stark contrast to write requests, which (in their default, nonsynchronized state) need not initiate any disk I/O until some time in the future. Thus, from the perspective of a user-space application, write requests stream, unencumbered by the performance of the disk.
  3. What does the I/O Scheduler do when it encounters a block number that is lower than the current one it dispatched? For example, if I know that the read/write head is at block number 50 and then the scheduler encounters a read request for block 30, does the scheduler wait for the head to complete its turn or does it go back to 30?

  4. If reads and writes are handled one after another then how can one type of I/O request starve another?
 
Old 01-16-2018, 04:09 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 16,653

Rep: Reputation: 2453
Some comments from some-one who has not delved into the code - value them as you will.

Your understanding of how I/O requests are handled seems somewhat amiss - they are not handled as soon as they are issued. They are delayed, and merged (so consecutive sectors are treated as one I/O) and maybe (usually) sorted by priority and/or time. There are also usually separate queues for read and write.
The significant issue for your query is that reads are synchronous (the issuing task blocks and waits for the data) and generally small. Writes are asynchronous (no-one really cares when the data hits the disk) and generally big. Sometimes really big.
Once a write starts (at the device), it generally can't be interrupted - it just keeps the device busy until it's finished. This is where the reads can get impacted - they just don't get a look-in if there is a big I/O in progress.
Different schedulers handle it differently, and recently multi-queue block schedulers have hit mainline - especially for things like SSDs.
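
To make the starvation scenario above concrete, here is a toy model, not kernel code. It assumes a queue that always dispatches the lowest pending sector, one new write arriving per tick, and a single read parked at a much higher sector. The function name, the sector numbers and the `read_expire` parameter (counted in ticks) are all made up for illustration.

```python
import heapq

def ticks_until_read_served(read_expire=None, ticks=50):
    """Return the tick at which the lone read is dispatched, or None if it starves.

    Toy model: the queue always dispatches the lowest pending sector.
    If read_expire is set, a read that has waited that many ticks is
    dispatched ahead of sector order (a deadline-style escape hatch).
    """
    # Entries are (sector, arrival_tick, kind); the heap orders by sector.
    pending = [(9000, 0, "read")]  # one read, far from the incoming writes
    for tick in range(ticks):
        # One new write arrives every tick, always below the read's sector.
        heapq.heappush(pending, (100 + 8 * tick, tick, "write"))
        if read_expire is not None:
            overdue = [r for r in pending
                       if r[2] == "read" and tick - r[1] >= read_expire]
            if overdue:
                return tick  # the expired read jumps the queue
        sector, _, kind = heapq.heappop(pending)
        if kind == "read":
            return tick
    return None

print(ticks_until_read_served())               # pure sector order
print(ticks_until_read_served(read_expire=5))  # with an expiry on reads
```

With pure sector ordering the read is never reached while writes keep arriving; giving reads an expiry bounds the wait. Real schedulers are considerably more involved than this.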

Good luck with your endeavours.

Edit: forgot to mention; blktrace is a really good tool to see the actual commands (and the merging) issued to the device. I haven't looked at it in a while, but it keeps getting updates. Kernel tracing improvements over the last year or two also allow you to see what is happening at an even lower level. Should be lots of fun once you know what functions you are interested in.

Last edited by syg00; 01-16-2018 at 04:13 AM.
 
Old 01-16-2018, 08:03 AM   #3
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 9,078
Blog Entries: 4

Rep: Reputation: 3165
The I/O scheduler also wants to be sure that, if there's a write to the same place that is also to be read, the read will capture the data that was written, not what the data used to be.
 
Old 01-19-2018, 11:13 PM   #4
AwesomeMachine
Senior Member
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 4,971

Rep: Reputation: 897
A read request is done in blocks, because the system requires everything NOW. Writes are more leisurely, so they don't need to be done immediately, because the system isn't asking for the data.
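
A rough way to see the "leisurely" write from user space is to time a buffered write() separately from the fsync() that forces the data out. This is only a sketch: the function name and the 8 MiB default are arbitrary, and the numbers you get depend entirely on the filesystem and device (on a RAM-backed temp directory both will be tiny).

```python
import os
import tempfile
import time

def timed_write_then_sync(nbytes=8 * 1024 * 1024):
    """Return (seconds spent in write(), seconds spent in fsync(), file size)."""
    payload = memoryview(b"\0" * nbytes)
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.perf_counter()
        written = 0
        while written < nbytes:
            # write() normally returns once the data is copied into the
            # page cache; no disk I/O is required at this point.
            written += os.write(fd, payload[written:])
        t1 = time.perf_counter()
        os.fsync(fd)  # blocks until the kernel has pushed the data to the device
        t2 = time.perf_counter()
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
    return t1 - t0, t2 - t1, size

write_s, sync_s, size = timed_write_then_sync()
print(f"write(): {write_s:.4f}s  fsync(): {sync_s:.4f}s  bytes: {size}")
```

A read has no equivalent of the first phase when the data is not already cached: the caller has to wait for the device.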
 
Old 01-22-2018, 08:44 AM   #5
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 9,078
Blog Entries: 4

Rep: Reputation: 3165
Quote:
Originally Posted by AwesomeMachine View Post
A read request is done in blocks ...
Write requests will be bundled together also when possible.
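
As a sketch of what "bundled together" means, the function below merges requests whose sector ranges are exactly back to back. It is a simplification with invented names: it takes (start_sector, sector_count) pairs, ignores read/write direction and overlapping ranges, and applies no size limit, all of which a real block layer has to consider.

```python
def merge_requests(requests):
    """Merge requests that are contiguous on disk.

    requests: iterable of (start_sector, sector_count) tuples.
    Returns a sorted list in which back-to-back ranges are combined.
    """
    merged = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins exactly where the previous one ends.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged

print(merge_requests([(8, 8), (0, 8), (32, 8), (16, 4)]))
# → [(0, 20), (32, 8)]
```

Four requests become two, so the device sees fewer, larger I/Os.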
 
Old 01-23-2018, 03:22 AM   #6
sreyan32
LQ Newbie
 
Registered: Jan 2015
Posts: 15

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00 View Post
Some comments from some-one who has not delved into the code - value them as you will.

Your understanding of how I/O requests are handled seems somewhat amiss - they are not handled as soon as they are issued. They are delayed, and merged (so consecutive sectors are treated as one I/O) and maybe (usually) sorted by priority and/or time. There are also usually separate queues for read and write.
The significant issue for your query is that reads are synchronous (the issuing task blocks and waits for the data) and generally small. Writes are asynchronous (no-one really cares when the data hits the disk) and generally big. Sometimes really big.
Once a write starts (at the device), it generally can't be interrupted - it just keeps the device busy until it's finished. This is where the reads can get impacted - they just don't get a look-in if there is a big I/O in progress.
Different schedulers handle it differently, and recently multi-queue block schedulers have hit mainline - especially for things like SSDs.

Good luck with your endeavours.

Edit: forgot to mention; blktrace is a really good tool to see the actual commands (and the merging) issued to the device. I haven't looked at it in a while, but it keeps getting updates. Kernel tracing improvements over the last year or two also allow you to see what is happening at an even lower level. Should be lots of fun once you know what functions you are interested in.
Tell me something: why isn't there something called the "reads-starving-writes problem"? Is it because write requests, when they are submitted, are usually not expected to complete immediately?
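
One way to picture the asymmetry behind this question is to give each request a deadline that depends on its type. The sketch below assumes expiry values of 500 ms for reads and 5000 ms for writes, which I believe match the documented defaults of the legacy deadline scheduler's read_expire and write_expire tunables; treat them as an assumption. The function names are invented, and ordering purely by deadline is a simplification (the real scheduler normally works in sector order and only falls back to the deadlines when a request expires).

```python
READ_EXPIRE_MS = 500    # assumed default for reads
WRITE_EXPIRE_MS = 5000  # assumed default for writes

def deadline_ms(kind, submitted_ms):
    """Latest time by which a request of this kind should be dispatched."""
    expire = READ_EXPIRE_MS if kind == "read" else WRITE_EXPIRE_MS
    return submitted_ms + expire

def dispatch_order(requests):
    """requests: list of (kind, submitted_ms). Return them ordered by deadline."""
    return sorted(requests, key=lambda r: deadline_ms(*r))

queue = [("write", 0), ("read", 100), ("write", 50), ("read", 4800)]
print(dispatch_order(queue))
# → [('read', 100), ('write', 0), ('write', 50), ('read', 4800)]
```

The read submitted at 100 ms goes ahead of writes that were queued before it, because a task is blocked waiting on it. The writes still have a deadline of their own, so a later flood of reads cannot postpone them forever, which is one reason the reverse problem gets less attention.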
 
  



Tags
kernel, scheduling

