Published at LXer:
While explaining the new splice() and tee() buffer management system calls [story], Linus Torvalds referred to some possible future extensions. These included vmsplice(), a system call "to basically do a 'write to the buffer', but using the reference counting and VM traversal to actually fill the buffer." Reviewing the implications of such a system call led to a comparison with FreeBSD's ZERO_COPY_SOCKET, which uses COW (copy-on-write).

Linus explained that while this may look good on specific benchmarks, it actually introduces extra overhead: "the thing is, the cost of marking things COW is not just the cost of the initial page table invalidate: it's also the cost of the fault eventually when you _do_ write to the page, even if at that point you decide that the page is no longer shared, and the fault can just mark the page writable again." He went on to explain, "The COW approach does generate some really nice benchmark numbers, because the way you benchmark this thing is that you never actually write to the user page in the first place, so you end up having a nice benchmark loop that has to do the TLB invalidate just the _first_ time, and never has to do any work ever again later on."

Linus didn't pull any punches when he summarized: "I claim that Mach people (and apparently FreeBSD) are incompetent idiots. Playing games with VM is bad. memory copies are _also_ bad, but quite frankly, memory copies often have _less_ downside than VM games, and bigger caches will only continue to drive that point home."
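
For readers who want to see what a "write to the buffer" by reference looks like in practice, here is a minimal sketch in C. It assumes the vmsplice() interface as documented in current Linux man pages, which may differ from what was being discussed at the time. The helper name roundtrip_via_vmsplice is hypothetical and exists only for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* vmsplice() */
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>    /* struct iovec */
#include <unistd.h>     /* pipe(), read(), close() */

/*
 * Hypothetical helper (not from the original discussion): hand `len` bytes
 * of user memory to a pipe with vmsplice(), then read them back out.
 * Keep `len` well below the pipe capacity (commonly 64 KiB), otherwise
 * vmsplice() would block because nothing drains the pipe concurrently.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t roundtrip_via_vmsplice(const char *src, size_t len,
                               char *out, size_t outlen)
{
    int fds[2];
    ssize_t sent, got = -1;

    assert(src != NULL && out != NULL);

    if (pipe(fds) < 0)
        return -1;

    /* Describe the user buffer; the kernel references these pages
     * rather than copying them into the pipe. */
    struct iovec iov = { .iov_base = (void *)src, .iov_len = len };

    sent = vmsplice(fds[1], &iov, 1, 0);
    if (sent >= 0) {
        size_t want = (size_t)sent < outlen ? (size_t)sent : outlen;
        got = read(fds[0], out, want);
    }

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

Calling roundtrip_via_vmsplice(msg, strlen(msg), out, sizeof out) on a short message should return the message length and leave the same bytes in out. Note that nothing here marks the source pages copy-on-write: the caller is simply expected not to modify the buffer until the pipe has been drained, which is the trade-off the quoted argument favors over "VM games".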