LinuxQuestions.org


Pearlseattle 02-10-2010 02:42 PM

Why not write to HDD when not busy?
 
Hi!

Well, this might be a very stupid question, but it has been stuck "in my throat" for a long time, and I finally dared to ask it :eek: :


When I write something to the HDD, Linux (and probably every other OS as well) holds it in RAM and waits until a timeout is reached or a specific event happens (e.g. the cached data that hasn't been written yet reaches a certain share of the total RAM) before it flushes the buffers and actually writes the data to the HDD.

Why?
Wouldn't it be better, when the HDD is idle, to write that stuff immediately instead of waiting e.g. 40 seconds? Waiting risks that, just when the timeout is reached, the HDD also receives a command to read something else, and you end up with a congested HDD.

The logic that I think Linux uses is more or less:
"Keep the data in memory until 1) the timeout is reached and/or 2) the buffer limit is reached and/or 3) a sync command is given".

Wouldn't it be better this way?
"Write bits of data to the HDD whenever the HDD is idle, and write expired data when 1) the timeout is reached and/or 2) the buffer limit is reached and/or 3) a sync command is given.
While writing to the HDD, if a read command needs HDD access, stop writing unless 1/2/3 occur".
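
The two policies quoted above can be compared in a toy model. This is only a sketch under stated assumptions: the names `WritebackCache`, `should_flush_current` and `should_flush_proposed`, the 30-second age limit and the 10-page dirty limit are all hypothetical and chosen for illustration. It is not how the kernel implements writeback, and it leaves out the "stop writing when a read arrives" part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class WritebackCache:
    """Toy model of dirty data waiting in RAM (not kernel code)."""
    max_age: float = 30.0      # hypothetical timeout, in seconds
    max_dirty: int = 10        # hypothetical buffer limit, in pages
    dirty: list = field(default_factory=list)  # timestamps of dirty pages

    def write(self, now: float) -> None:
        """Record one page dirtied at time `now`."""
        self.dirty.append(now)

    def _forced(self, now: float, sync: bool) -> bool:
        """Conditions 1/2/3 from the post: timeout, buffer limit, sync."""
        expired = any(now - t >= self.max_age for t in self.dirty)
        full = len(self.dirty) >= self.max_dirty
        return bool(self.dirty) and (expired or full or sync)

    def should_flush_current(self, now, disk_idle, sync=False):
        """Policy as the poster understands it: idleness is ignored."""
        return self._forced(now, sync)

    def should_flush_proposed(self, now, disk_idle, sync=False):
        """Proposed policy: also flush whenever the disk looks idle."""
        return self._forced(now, sync) or (bool(self.dirty) and disk_idle)


cache = WritebackCache()
cache.write(now=0.0)
print(cache.should_flush_current(now=1.0, disk_idle=True))    # False
print(cache.should_flush_proposed(now=1.0, disk_idle=True))   # True
print(cache.should_flush_current(now=31.0, disk_idle=False))  # True
```

The only difference between the two methods is the extra `disk_idle` term, which is exactly the part the rest of the thread argues about: `disk_idle` has to come from somewhere, and it may be out of date by the time it is used.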

Hope I was able to explain what I mean.
Thx! :hattip:

pixellany 02-10-2010 02:50 PM

Your logic is fine---all I can offer is that it seems to work fine the way it is. The real test is whether the user sees smooth operation.

Quakeboy02 02-10-2010 03:02 PM

Hate to bust your bubble, but there is no low-hanging fruit in the kernel - mostly because the guys who write the drivers aren't high school kids doing it after school. Yes, they're volunteers, but yes (again) they are professionals in their real lives. If you have a question about how the various filesystem types work, then google is your friend. The kernel source is also available. There are also various kernel developer mailing lists.

CoderMan 02-10-2010 04:57 PM

I won't pretend to be an expert on the subject, but I know that certain file systems like XFS make use of this buffering, in conjunction with delayed allocation, to decrease file system fragmentation and improve performance.

Keep in mind that writing data to your hard drive is one of the slowest operations routinely performed on your computer. If I recall the statistics correctly, it can take 1000x longer or more to write data to a hard drive than to other kinds of memory in your computer. So don't knock the system too hard before you have a good understanding of the technical difficulties involved.

chrism01 02-10-2010 07:19 PM

The general rule of thumb I was taught was that each step away from the CPU is (roughly) an order of magnitude slower to read/write, i.e. x10, so:

cpu = 1
cache = x10 slower
RAM = x10 slower than cache
disk = x10 slower than RAM

tredegar 02-11-2010 05:06 AM

Quote:

Wouldn't it be better, when the HDD is idle,......

It is impossible to determine that the HDD is "idle", at the instant anything decides to write to it.

You can determine that the HDD has been idle (for the last 4 seconds perhaps), but by now you have missed the opportunity to use it whilst no other process was using it.

The very moment you do decide to write to it, some other process could start up and want to use the disk as well, so it is now no longer "idle".

See the problem?

I am happy to leave this stuff to the kernel maintainers.
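
The timing problem described in this post can be shown with a few made-up timestamps. This is a minimal sketch, assuming a hypothetical helper `collides` and invented arrival times; the 4-second window is the figure from the post, nothing more.

```python
def collides(check_time, write_duration, read_arrivals, idle_window=4.0):
    """Return (looked_idle, collided) for a write started at check_time.

    looked_idle: no read arrived during the idle_window before the check.
    collided:    a read arrived while the write was still in progress.
    """
    looked_idle = not any(
        check_time - idle_window <= t <= check_time for t in read_arrivals
    )
    collided = any(
        check_time < t < check_time + write_duration for t in read_arrivals
    )
    return looked_idle, collided


# The disk has been quiet since t=2.0, so at t=10.0 it looks idle,
# but a read shows up at t=10.1, right after the write has started.
print(collides(check_time=10.0, write_duration=0.5, read_arrivals=[2.0, 10.1]))
# (True, True)
```

The check can only look backwards in time, so "looked idle" and "collided anyway" can both be true for the same write.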

Pearlseattle 02-11-2010 06:14 AM

Well CoderMan, you're right about the issue with the fragmentation - that's a point I missed.
And tredegar, you're right as well about the problem that the "idle" status you observe is already in the past by the time you act on it.

We'll see! :D

Thank youuu!

H_TeXMeX_H 02-11-2010 11:54 AM

Quote:

Originally Posted by CoderMan (Post 3859655)
I won't pretend to be an expert on the subject, but I know that certain file systems like XFS make use of this buffering, in conjunction with delayed allocation, to decrease file system fragmentation and improve performance.

Yup, XFS can do this if you set the options right. I think recently ext4 does this by default (which is madness, IMO). If the power goes out before it writes it from RAM to HDD ... you can say goodbye to your data.
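
For data that must not sit in RAM waiting for the next flush, an application can ask for it to be written out explicitly. The following is a minimal sketch using Python's standard `os.fsync`; the helper name `durable_write` and the file name are hypothetical. Whether the drive's own write cache honours the request depends on the hardware and mount options, so treat this as an illustration rather than a guarantee.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data and ask the kernel to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move data from Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to write its cached copy out


# Usage: the file name below is a hypothetical example.
path = os.path.join(tempfile.gettempdir(), "fsync_demo.bin")
durable_write(path, b"important data")
with open(path, "rb") as f:
    print(f.read())
os.remove(path)
```

The price is the one the thread keeps coming back to: every `fsync` makes the caller wait for the slow device instead of returning as soon as the data is in RAM.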

Quakeboy02 02-11-2010 12:00 PM

Quote:

Originally Posted by tredegar (Post 3860155)
The very moment you do decide to write to it, some other process could start up and want to use the disk as well, so it is now no longer "idle".

See the problem?

Processes do not write directly to the disk, and haven't since DOS, when there was only one process running at a time. Processes send read/write requests to the operating system. The operating system decides when and how to service those requests. In some cases, the data for a read is already in memory, so the kernel will supply that. In other cases, the disk has to be read, and the kernel places the read request on a queue where it will be prioritized and ordered and eventually sent to the disk. There is also a write queue where write requests are ordered and prioritized to be sent to the disk. Some filesystems have a journal that is written to first, as it is faster, and then the journal is replayed onto the disk when there is time.
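
The flow described above (reads served from memory when possible, everything else queued and reordered before it reaches the disk) can be sketched as a toy model. The class `ToyBlockLayer` and its methods are hypothetical names, and the sketch uses a single queue sorted by block number for brevity, whereas the post describes separate read and write queues with prioritization. It is not kernel code.

```python
class ToyBlockLayer:
    """Toy model of cache hits and an ordered request queue (not kernel code)."""

    def __init__(self):
        self.cache = {}      # block number -> data already in memory
        self.queue = []      # pending (block, op, data) requests

    def read(self, block):
        """Serve from memory if possible, otherwise queue a disk read."""
        if block in self.cache:
            return self.cache[block]
        self.queue.append((block, "read", None))
        return None          # a real caller would sleep until the read completes

    def write(self, block, data):
        """Keep the new data in memory and queue it for the disk."""
        self.cache[block] = data
        self.queue.append((block, "write", data))

    def dispatch(self):
        """Send queued requests in ascending block order, then clear the queue."""
        ordered = sorted(self.queue, key=lambda req: req[0])
        self.queue.clear()
        return [(block, op) for block, op, _ in ordered]


layer = ToyBlockLayer()
layer.write(90, b"x")
layer.read(10)
layer.write(40, b"y")
print(layer.read(90))      # b'x', served from memory without touching the queue
print(layer.dispatch())    # [(10, 'read'), (40, 'write'), (90, 'write')]
```

Requests arrive in the order 90, 10, 40 but leave in the order 10, 40, 90, which is the kind of reordering that only becomes possible because the writes were held back for a while.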

Quakeboy02 02-11-2010 12:03 PM

Quote:

Originally Posted by H_TeXMeX_H (Post 3860509)
Yup, XFS can do this if you set the options right. I think recently ext4 does this by default (which is madness, IMO). If the power goes out before it writes it from RAM to HDD ... you can say goodbye to your data.

Let's not forget that memory buffers are flushed to disk every 5 seconds. Sure, you can lose 5 seconds worth of data, but as modern filesystems get trickier (more madness) the chance of losing filesystem integrity decreases. Which means less chance of losing data on a large scale.
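
On Linux the intervals involved can be inspected under `/proc/sys/vm`. The sketch below reads `dirty_writeback_centisecs` (how often the flusher wakes up) and `dirty_expire_centisecs` (how old dirty data must be before it is written out). The helper name `read_centisecs` is hypothetical. Commonly seen defaults are 500 and 3000, i.e. 5 and 30 seconds, but that is an assumption about a typical system; check your own.

```python
from pathlib import Path


def read_centisecs(name: str):
    """Return a /proc/sys/vm value in seconds, or None if unavailable."""
    path = Path("/proc/sys/vm") / name
    try:
        return int(path.read_text().strip()) / 100.0
    except (OSError, ValueError):
        return None   # not Linux, or the file is missing or unreadable


for name in ("dirty_writeback_centisecs", "dirty_expire_centisecs"):
    print(name, "->", read_centisecs(name), "seconds")
```

On a system without these files the function returns `None` instead of raising, so the snippet is safe to run anywhere.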

H_TeXMeX_H 02-11-2010 12:11 PM

Quote:

Originally Posted by Quakeboy02 (Post 3860516)
Let's not forget that memory buffers are flushed to disk every 5 seconds. Sure, you can lose 5 seconds worth of data, but as modern filesystems get trickier (more madness) the chance of losing filesystem integrity decreases. Which means less chance of losing data on a large scale.

I think with XFS you can even change that option to be longer. Still 5 seconds is a long time ... you could lose at around 50 MB/s ... 250 MB of data.
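
The figure above is simply interval times write rate. A tiny sketch, with the hypothetical helper `data_at_risk_mb`, makes it easy to try other intervals; it assumes the disk is being written to at the full rate for the whole interval, which is the worst case.

```python
def data_at_risk_mb(interval_s: float, write_rate_mb_s: float) -> float:
    """Upper bound on unwritten data if power fails just before a flush."""
    return interval_s * write_rate_mb_s


print(data_at_risk_mb(5, 50))    # 250
print(data_at_risk_mb(30, 50))   # 1500
```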

Quakeboy02 02-11-2010 12:23 PM

Quote:

Originally Posted by H_TeXMeX_H (Post 3860524)
I think with XFS you can even change that option to be longer. Still 5 seconds is a long time ... you could lose at around 50 MB/s ... 250 MB of data.

If your needs are really that critical, you might want to invest in a UPS.

H_TeXMeX_H 02-11-2010 12:57 PM

Quote:

Originally Posted by Quakeboy02 (Post 3860536)
If your needs are really that critical, you might want to invest in a UPS.

I got one ... it's great, especially here where the power is not at all stable.

Pearlseattle 02-11-2010 02:26 PM

Just for information: did you try btrfs? And if yes, did you have any bad experiences using it with a recent kernel?

I gave it a shot for the first time a few months ago and I really liked the "re-balance" & "re-sizing" it did to a raid5 when I added an additional HDD to the 3 HDDs I had initially.

A few weeks ago I therefore decided to use it as the root fs for my netbook, and so far I have had no issues, apart from the fact that (only?) Gentoo does not (yet?) have a fsck that runs when the PC boots (and btrfs apparently doesn't yet support running a fsck on mounted filesystems).

Performance seems to be ok, especially when running a non-cached "emerge" (the package manager of Gentoo), which reads a large number of small files - I have the feeling that when doing that it's faster than xfs, reiserfs (for a long time my favourite, now with btrfs as candidate replacement) and jfs. But I don't yet have numbers available to back up my statement.

Quakeboy02 02-11-2010 02:51 PM

Looks very Oracle-ish. :) But, does it do anything to address the question you posed in your original post?


All times are GMT -5.