dd if=file1 of=file2 bs=1k PUSHES AVG QUEUE LENGTH UP TO 80 !!!!!
Hi all,
I assume the following command would read 1k first, then write it to the output, and then repeat the cycle to the end.
# dd if=/dev/sda of=/dev/null bs=1k
Because 1k is only 2 sectors on disk, I figured there would be only one request to the disk at a time.
Therefore dd would issue a request to read 1k, wait for the data, write it to /dev/null, and only then issue the next read.
So there could be only one request in the disk's request queue at any time, or so I thought.
However, what happened actually was as below.
While dd is using bs of 1k to read from /dev/sda and write to /dev/null,
# watch -n .1 cat /proc/diskstats
shows that there are as many as 80 requests in progress (that is the 3rd number from the end in each line)
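To watch just that number instead of the whole file, field 9 of the per-disk stats (column 12 counting the major/minor/name columns) can be pulled out with awk; a small sketch, assuming the device shows up as sda in /proc/diskstats:

```shell
# Print only the in-flight I/O count (diskstats field 9, overall column 12)
# for sda; substitute your own device name.
awk '$3 == "sda" { print $3, "in-flight:", $12 }' /proc/diskstats
```

Wrapped in watch it becomes `watch -n .1 "awk '\$3 == \"sda\" { print \$12 }' /proc/diskstats"` (note the escaped `$` and quotes, since watch passes the command through a shell).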
I am struggling to understand how there can be 80 requests when there should be only 1 at most.
Interestingly, I only see this with disks behind the HP cciss driver, not with the plain SCSI driver; with the SCSI driver the queue length is usually 1.
SCSI disks are used in both tests.
Always a bad start.
Testing to validate (or not as the case may be) your assumptions is commendable.
Try changing the I/O scheduler to NOOP and run your tests again.
However, I strongly believe that with bs=1k, a single read request from dd should be the only IO request to the disk until the disk returns the requested data.
I have confirmed the theory by testing as below.
# dd if=/dev/sda of=/dev/null bs=1G
and then trace it
# strace -p 'pid of dd'
Because it takes some time to read 1G, strace shows dd blocked in read() for a long time, then quickly writing to /dev/null, then spending a long while in read() again, and so on.
(This shows that dd processes only one request at a time. But a single 1G read() from dd is broken into MANY smaller requests by the time it reaches the I/O scheduler.
That is why I used bs=1k, to make the problem easier to reason about.)
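As a quick cross-check of the one-read()-per-block behaviour, strace can also count the syscalls instead of printing each one; a sketch (the count=1000 cap is my addition so the run terminates on its own):

```shell
# Summarise read/write syscalls for a bounded dd run; with bs=1k and
# count=1000 the summary should show on the order of 1000 reads
# and 1000 writes, i.e. one read() per block.
strace -c -e trace=read,write \
    dd if=/dev/sda of=/dev/null bs=1k count=1000
```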
I just don't know enough to explain how there can be 80 requests in the request queue.
There is absolutely no other apps using the disk except my dd.
No disk swapping. CPU idle 100% before testing.
I/O scheduler can be set per device - "on the fly"; doesn't require a (re-)boot. If you can find a quiet period for the environment you may be able to test.
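For example (device name assumed; the write needs root):

```shell
# The scheduler in use is the bracketed entry:
cat /sys/block/sda/queue/scheduler   # e.g. "noop anticipatory deadline [cfq]"
# Extract just the active one:
sed -e 's/.*\[\(.*\)\].*/\1/' /sys/block/sda/queue/scheduler
# Switch to noop on the fly, then rerun the dd test:
echo noop > /sys/block/sda/queue/scheduler
```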
Quote:
However, I strongly believe that with bs=1k, a single read request from dd should be the only IO request to the disk until the disk returns the requested data.
I have confirmed the theory by testing as below.
# dd if=/dev/sda of=/dev/null bs=1G
and then trace it
# strace -p 'pid of dd'
Using a 1 Gig blocksize to test ("confirm" ... ???) a theory about a 1 K blocksize (2 sectors) is not valid in any sense.
You might be interested in the data provided by blktrace - I found it quite instructive. But I wouldn't use it in anything but a test system.
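A minimal invocation, assuming the device is /dev/sda (root required, and the kernel must have block-layer tracing support):

```shell
# Stream block-layer events for sda and decode them as they arrive.
# In blkparse output, Q = request queued, M = merged into an existing
# request, D = dispatched to the driver, C = completed.
blktrace -d /dev/sda -o - | blkparse -i -
```

Comparing the number of Q events against D and C events would show directly whether each 1k read() from dd becomes one request or several.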
If dd is a single-threaded process, then regardless of block size it's only going to issue one read() at a time. However, perhaps that request is being broken down in kernel space into smaller amounts; 4K pages sounds plausible, or maybe even individual disk blocks (sectors). Maybe the disk driver always reads a track or cylinder at a time. Does that 80 have any relation to your disk geometry values at all, sectors per track or anything like that?
This is all conjecture on my part, I've not studied Linux internals enough to do much more, just throwing it up there to give you some things to consider.
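A few sysfs values worth comparing against the 80 (paths below are for a plain SCSI disk; a cciss logical drive appears under /sys/block/cciss!c0d0 instead):

```shell
cat /sys/block/sda/queue/max_sectors_kb  # largest single request the block layer will build
cat /sys/block/sda/queue/nr_requests     # size of the request queue itself
cat /sys/block/sda/device/queue_depth    # depth the low-level driver exposes (SCSI devices)
```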
>> using a test of 1 Gig blksize to test ("confirm" ... ???) a theory re 1 K blocksize (or 1 sector) is not valid in any sense.
By issuing # dd bs=1G and stracing it, I was able to see that dd is single-threaded and waits for a 1G read request to complete before proceeding with a 1G write.
So I was led to believe it would behave the same way with bs=1k:
1k read, completed, 1k write, completed, and so on.
Now, for bs=1k, how many requests should I expect to see in the request queue? Just ONE.
But with the HP cciss driver it goes beyond 1, as high as 80~81 on my system.
80 doesn't mean anything special but just happens to be the highest number I have been seeing with the test.
Anyone please tell me how this can be explained?
(I have just tested the same thing on a Xen virtualised guest OS and the queue length is between 4~5 while #dd bs=1k is running)
Field 9 -- # of I/Os currently in progress
The only field that should go to zero. Incremented as requests are given to appropriate struct request_queue and decremented as they finish.
You're still assuming that One read() call == One request. I don't know whether that assumption is correct or not, but the simplest explanation to the results that you're seeing is that it isn't.
But when it is only a 1k read request, I assumed it would turn into only one call to the disk (assuming there is no other access to the same device).
Yes, that does sound as if it ought to be the case doesn't it.
Perhaps even though dd is only fetching 1K with the read() the kernel is anticipating further reads and is pre-fetching subsequent sectors for efficiency purposes? But again, I'm only guessing here.
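That readahead guess is easy to test: turn prefetch off for the device and rerun the dd experiment. A sketch (root needed; /dev/sda assumed):

```shell
blockdev --getra /dev/sda               # current readahead window, in 512-byte sectors
cat /sys/block/sda/queue/read_ahead_kb  # same setting, expressed in KB
blockdev --setra 0 /dev/sda             # disable readahead
# ...rerun: dd if=/dev/sda of=/dev/null bs=1k   and re-check /proc/diskstats
blockdev --setra 256 /dev/sda           # restore a common default (128 KB)
```

If the in-flight count drops to ~1 with readahead disabled, the 80 was prefetch, not dd.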
I get the feeling this is something that needs a fairly in-depth understanding of the I/O-specific parts of the kernel to explain. I'm quite curious about this too now, but I don't think we're going to find an answer here on LQ.
Anyway, any further guesswork on my part isn't really going to add any value from this point on so I'll bow out. Hope you didn't mind me sharing my thought processes on this.