sparse file vs. ramfs
It seems that ramfs handles sparse files just fine as long as all you do is write (or not, as the case may be) the file data. But as soon as the sparse file is read, even the "non-existent" blocks that were never written to (and would not be allocated any space on a filesystem that supports sparse files) come to occupy space. Once the file is unlinked, though, that space is freed again.
So ramfs does "support" sparse files in the sense that there is enough of a data structure (essentially the page cache, since it's not a real filesystem) to know which blocks occupy space and which do not, and it can free them (it does so when the file is unlinked).
My concern is: why does it need to retain a cached block that was never written but does get read? I understand that the mechanism is basically waiting for a backing store (one that will never arrive). But such blocks could still be purged, because that is exactly what happens when the file is unlinked: the blocks are no longer tracked in the cache (the data may linger in RAM, but nothing marks it as needing to be written).
IMHO, what it should do, if a block was never written but does get read, is fill the read buffer with binary zeros and then discard the block, so it continues to occupy no space.
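To make the proposed behavior concrete, here is a toy user-space sketch (not kernel code; all names are made up for illustration) of a block store that caches blocks only when they are written. Reading a hole synthesizes a throwaway zero buffer and deliberately caches nothing, which is the opposite of what ramfs does today:

```python
BLOCK_SIZE = 4096

class SparseBlockStore:
    """Toy model: holes are simply absent from the dict of cached blocks."""

    def __init__(self):
        self.blocks = {}  # block number -> bytes; never-written blocks absent

    def write_block(self, n, data):
        # Only a write allocates a resident block (zero-padded to block size).
        self.blocks[n] = bytes(data).ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]

    def read_block(self, n):
        # Key point: a hole is served from a fresh zero buffer and is NOT
        # inserted into self.blocks, so reading it costs no lasting space.
        return self.blocks.get(n, b"\0" * BLOCK_SIZE)

    def resident_blocks(self):
        return len(self.blocks)

store = SparseBlockStore()
store.write_block(0, b"hello")
hole = store.read_block(7)        # read a never-written block
print(store.resident_blocks())    # → 1: the hole read allocated nothing
```

The design choice is simply that only writes populate the cache; reads of unwritten blocks are satisfied from transient zeros, matching what filesystems with real backing stores effectively achieve for holes.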
I'm wondering how much of ordinary buffer cache for real file systems is taking up RAM, at least temporarily, just because a buffer of zeros was created for a process that read a sparse block of a file.
Try creating a fully sparse file in ramfs (e.g. use the truncate command on a zero-length file), and check "du" on it. Then read it and run "du" on it again. Just be sure the size you truncate it to is well below your physical RAM size, or you could have a nice little "oops".
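The experiment above can be scripted. The sketch below runs against an ordinary sparse-capable filesystem (ext4, tmpfs, etc.) as a baseline, where the block count stays at zero after the read; on a ramfs mount (e.g. `mount -t ramfs none /mnt/ramfs`, which needs root and is assumed, not shown) the second check would instead report the whole file as resident, which is why the truncate size must stay well below physical RAM:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "sparse")     # illustrative path; use a ramfs
                                         # mount point to see the ballooning
    with open(path, "wb"):
        pass                             # zero-length file
    os.truncate(path, 16 * 1024 * 1024)  # now 16 MiB of pure hole

    # Equivalent of the first "du": no blocks allocated yet.
    print(os.stat(path).st_blocks)

    with open(path, "rb") as f:
        while f.read(1 << 20):           # read every "non-existent" block
            pass

    # Second "du": still 0 on a sparse-capable fs; on ramfs, fully allocated.
    print(os.stat(path).st_blocks)
```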