
nuliknol 03-14-2012 03:43 PM

is write() with O_DIRECT ACID compliant?
My database engine stores records of 64 bytes and updates them by issuing a write() syscall for the entire disk block. The device is opened with O_DIRECT. For example, the third record within a block starts at byte 128 and ends at byte 192; when I do an UPDATE, the entire disk block (512 bytes by default) is written.
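To make the scheme concrete, here is a minimal sketch of the in-place update as I understand it from the description above. The names update_record_in_place, BLOCK_SIZE and RECORD_SIZE are made up for illustration, and the function takes an already-open descriptor (opened with O_DIRECT in the real engine, which is why the buffer is aligned with posix_memalign). It is not the actual engine code.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512   /* assumed physical block size */
#define RECORD_SIZE 64   /* record size from the post */

/* Hypothetical helper: overwrite record `slot` of block `block_no` in place.
 * Reads the whole block, patches 64 bytes, writes the whole block back,
 * then calls fsync(). Returns 0 on success, -1 on any error. */
int update_record_in_place(int fd, off_t block_no, unsigned slot,
                           const void *record)
{
    void *buf = NULL;
    int rc = -1;
    off_t off = block_no * BLOCK_SIZE;

    if (slot >= BLOCK_SIZE / RECORD_SIZE)
        return -1;
    /* O_DIRECT needs an aligned buffer, offset and length. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;

    if (pread(fd, buf, BLOCK_SIZE, off) != BLOCK_SIZE)
        goto out;
    memcpy((char *)buf + slot * RECORD_SIZE, record, RECORD_SIZE);
    /* The old contents are destroyed here: if this write is torn,
     * neither the old nor the new version of the block survives. */
    if (pwrite(fd, buf, BLOCK_SIZE, off) != BLOCK_SIZE)
        goto out;
    if (fsync(fd) != 0)
        goto out;
    rc = 0;
out:
    free(buf);
    return rc;
}
```

With this layout the third record is slot 2, which covers bytes 128 to 191 of the block.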

My question is: can I claim ACID compliance if I am writing the record over itself every time an UPDATE occurs? Usually database engines do this in two steps: they write the modified disk block to another (free) location, and then, immediately after the first write returns success, update an index to point to the new block with one (atomic) write. But I am not doing this; I am overwriting the current data with the new data and expecting the write to succeed. Does my method have any potential problems? Is it ACID compliant? What if the hardware writes only half of the block and my record is exactly in the middle? Or does the hardware already do the two-step write process I described, but at block level, so I don't need to repeat the same in software?
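For comparison, the two-step process described above could be sketched roughly like this. The layout is an assumption made only for the example: block 0 holds the number of the current data block, and the caller supplies a free block. The names write_block, shadow_update and current_block are hypothetical. Whether the final one-block pointer write is really atomic depends on the hardware; the sketch assumes it, it does not prove it.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512   /* assumed physical block size */

/* Write up to one block of data to block `block_no` (zero-padded) and fsync. */
static int write_block(int fd, uint64_t block_no, const void *data, size_t len)
{
    void *buf = NULL;
    int rc = -1;

    if (len > BLOCK_SIZE)
        return -1;
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, 0, BLOCK_SIZE);
    memcpy(buf, data, len);
    if (pwrite(fd, buf, BLOCK_SIZE, (off_t)(block_no * BLOCK_SIZE)) == BLOCK_SIZE
        && fsync(fd) == 0)
        rc = 0;
    free(buf);
    return rc;
}

/* Step 1: write the new version to a free block and wait for it.
 * Step 2: repoint block 0 at it with a single one-block write.
 * A crash before step 2 completes leaves the old version reachable. */
int shadow_update(int fd, uint64_t free_block, const void *new_data)
{
    if (free_block == 0)               /* block 0 is the pointer block */
        return -1;
    if (write_block(fd, free_block, new_data, BLOCK_SIZE) != 0)
        return -1;
    return write_block(fd, 0, &free_block, sizeof free_block);
}

/* Read back which data block is current, or -1 on error. */
int64_t current_block(int fd)
{
    void *buf = NULL;
    uint64_t n;
    int64_t result = -1;

    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    if (pread(fd, buf, BLOCK_SIZE, 0) == BLOCK_SIZE) {
        memcpy(&n, buf, sizeof n);
        result = (int64_t)n;
    }
    free(buf);
    return result;
}
```

The point of the extra write is that the old block is never touched, so a failed or partial write of the new block cannot damage the version the index still points to.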

(Note: no record is larger than a physical disk block (512 bytes by default), fsync() is called after each write(), and this is for Linux only.)

durval 03-15-2012 10:54 AM


The way I see it, O_DIRECT guarantees nothing of the sort: it only bypasses the system caches, which could be far from enough to guarantee any kind of ACID-like integrity.

On the other hand, calling fsync() after each write does guarantee that all information necessary to re-read all data that has been written to the file is indeed written to disk (both file data *and* filesystem metadata). This should take care of the 'D' ("durability") part of ACID.
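As a rough illustration of the write-then-fsync pattern, something like the following could be used. The name durable_write is made up for the example. It retries short writes, which is fine for an ordinary buffered descriptor; with O_DIRECT a resumed partial write may no longer be aligned, so treat this as a sketch of the idea only. The important part is that success is reported only after fsync() itself has returned 0.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all of `data`, then fsync.
 * Returns 0 only if every byte was accepted AND fsync() succeeded. */
int durable_write(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    /* write() returning success only means the kernel accepted the data;
     * fsync() is the call that reports whether it reached the device. */
    return fsync(fd) == 0 ? 0 : -1;
}
```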

Regarding the "A"tomicity, "C"onsistency and "I"solation parts, I don't really think they come into play in the scenario you described above.

Durval Menezes.

All times are GMT -5.