I'm trying to figure out why Linux generally splits I/O requests into smaller chunks.
Here is my test:
# time dd if=/dev/sdak bs=1024k of=/dev/null iflag=direct
At the same time:
$ iostat -mx dm-10 10
avg-cpu:    %user    %nice  %system  %iowait   %steal    %idle
            11.97     0.00     1.79    12.21     0.00    74.03
Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await    svctm    %util
sdak         0.00     0.00   800.00     0.00   200.00     0.00   512.00     1.77     2.22     0.60    47.86
* I.e., as per iostat, the 1024KB operations are being split into 512-sector chunks (i.e. into 256KB). The same happens with 512KB requests too.
* Requests of 256KB and smaller don't seem to be split in that way.
* 8MB -> split to 256KB
* 20000MB -> split into ~500KB avg request size. (So maybe it can do more than 256K sometimes...)
* Same observations (I/O is split into smaller requests) with other tools: Oracle Orion (i.e. async I/O results in the same splitting); sar shows the same request sizes as iostat; blktrace confirms it too.
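For reference, this is how I arrive at the 256KB figure from the iostat line above. It is just arithmetic, assuming avgrq-sz is reported in 512-byte sectors; the function names are my own and purely illustrative. The second function cross-checks the result from rMB/s and r/s.

```shell
# Convert iostat's avgrq-sz (assumed to be in 512-byte sectors) to KB.
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

# Cross-check: average request size in KB from throughput and request rate.
# $1 = rMB/s, $2 = r/s (integer values only in this sketch)
avg_request_kb() {
    echo $(( $1 * 1024 / $2 ))
}

sectors_to_kb 512        # avgrq-sz from the iostat output above
avg_request_kb 200 800   # rMB/s and r/s from the same line
```

Both calls give 256, so the two ways of reading the iostat line agree.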
This has been tested on Oracle Linux 5.7, RHEL 5.5, RHEL 5.7, and Ubuntu 11.10. I tried multipath devices, local SATA drives, virtual machine disks, and mdadm RAID0 (chunk sizes 1MB, 2MB, 4MB). Sometimes the size was a bit different (e.g. 128K instead of 256K), but in all cases the 1024K requests were split into smaller chunks.
So my questions are:
Q1. Why does that happen? What does it depend on? (Is it related to the hardware, to the host bus adapter or its driver, or is it more of a Linux-kernel-specific behavior?)
Q2. Can this be changed? Can we make Linux send full-size 1MB requests to the storage?
Here is an article which I think touches on some points probably related to my issue: http://people.redhat.com/msnitzer/docs/io-limits.txt
Still, it's not clear to me how I can increase the I/O size and avoid it being fragmented, where these limits come from, and how we can change them.
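In case it helps anyone answering, here is a minimal sketch of how the per-device request-size limits could be read from sysfs. It assumes the relevant knobs are the block queue attributes max_sectors_kb and max_hw_sectors_kb under /sys/block/<dev>/queue; I have not confirmed that these are what cause the splitting. The function name and the optional sysfs-root argument are hypothetical, added only for illustration.

```shell
# Print the request-size limits the block layer exposes for a device.
# $1 = device name (e.g. sdak)
# $2 = sysfs root, defaults to /sys (hypothetical parameter, for illustration)
show_queue_limits() {
    dev=$1
    root=${2:-/sys}
    for attr in max_sectors_kb max_hw_sectors_kb; do
        f="$root/block/$dev/queue/$attr"
        if [ -r "$f" ]; then
            echo "$attr=$(cat "$f")"
        else
            # Attribute missing or unreadable on this kernel/device.
            echo "$attr=unavailable"
        fi
    done
}

show_queue_limits sdak
```

If max_sectors_kb turned out to match the observed request size on each system, that would at least suggest where the limit is applied.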