LinuxQuestions.org
Old 05-21-2012, 09:05 PM   #1
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Rep: Reputation: 51
file systems: external edits to an open file


I'm looking for some perspective on a particular point that is confusing me. Let's say that, on a GNU/Linux system:

1. I open file A with fopen(), read from it, and leave the program running (i.e., without closing A). Then...

2. Using a different program (say, a text editor), I edit file A, changing a few words. Then...

3. Back in my original program, I rewind() the open stream for file A and read from it again.

So, after step three, my program will always have read the new, edited version of file A's data, correct? Or is this file-system dependent?

The reason I ask is that I seem to be getting different results for the same program in this scenario: in one case the data file is on an XFS file system, and in the other it is on an ext3 file system.
 
Old 05-21-2012, 09:59 PM   #2
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 943
Quote:
Originally Posted by hydraMax View Post
So, after step three, my program will always have read the new, edited version of the file A data, correct?
The simple answer is no; there is no such guarantee. Exactly when changes to the underlying file become visible to another process varies, particularly on multiprocessor systems.

If you modify step three to reopen the underlying file, you should always see the new, edited version. (This is recommended anyway, because many editors replace the original file with a new one: your program would still be accessing the old, already-deleted file's contents. Since the old file is still open, its data remains on disk, but it vanishes when the last open descriptor is closed.)
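Something like this shows the effect (a minimal sketch; the file names and helper functions are just for the demonstration). An editor that saves by renaming a new file over the old one leaves an already-open FILE* pointing at the old inode, so only a reopen sees the new data:

```c
#include <stdio.h>
#include <string.h>

/* Write `text` into `path`, creating or truncating it. */
static void write_file(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (f) {
        fputs(text, f);
        fclose(f);
    }
}

/* Re-read the first line of an open stream into buf. */
static void read_line(FILE *f, char *buf, size_t n)
{
    rewind(f);
    if (!fgets(buf, (int)n, f))
        buf[0] = '\0';
}
```

With these helpers, open a file, replace it by rename() ("the editor"), and compare what the old stream and a fresh fopen() return: the old stream still shows the old contents, the fresh stream shows the new ones.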

If the editing process and your process use advisory file locks (see man fcntl), with the editing program temporarily acquiring a write lock on the file, then your program will see the new data after it re-acquires its lock. The same applies to file leases, except that leases are Linux-specific.

However, standard C I/O has internal buffers, and you indicated you are using standard C I/O. The fflush() man page notes that, to be sure you see the current contents of the file (and not just the contents of the standard library's buffers), you should fflush() the input stream before rewinding it, to discard any cached data. The GNU C library uses filesystem information to choose the buffer size for each file (it tries to use the native I/O block size, I believe), which could be why the effect appears on one filesystem but not on another. You can run stat -c %o FILE-OR-DIRECTORY to see the native I/O block size.
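A minimal sketch of that fflush-then-rewind re-read (the helper name is mine):

```c
#include <stdio.h>

/* Re-read the first line of an already-open stream, discarding any
 * buffered input first so the next read hits the file's current
 * contents rather than stdio's cached data. Returns buf, or NULL
 * on read failure. */
static char *reread_line(FILE *f, char *buf, int n)
{
    fflush(f);   /* on an input stream: discard the buffered data */
    rewind(f);   /* seek back to the start, clearing EOF/error    */
    return fgets(buf, n, f);
}
```

If another process rewrites the file in place between the first read and the call to reread_line(), the second read returns the new contents.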

Finally, the caching done by ext3 is a bit wonky, and could be why you are seeing old data on ext3. Search for ext3 fsync on the web to dive into the matter. I seem to remember that with certain settings it could take a very long time for modified data to show up on open descriptors; something about the way it does writeback, I think. I switched to ext4 years ago, so I haven't kept up with ext3 quirks at all; ext4 is faster, besides.

If you think it would be useful, I could easily whip up two C99 programs to investigate the issue/effect using low-level I/O.
 
1 member found this post helpful.
Old 05-22-2012, 01:29 AM   #3
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Original Poster
Rep: Reputation: 51
@Nominal: Thanks for the detailed response. Very helpful and interesting.

Looking into my code some more, it seems the problem wasn't that the file contents pointed to by the stream aren't changing (although that may still be a problem), but that my program only notices a change when it detects that the mtime (from stat()) has changed. For some reason, the mtime of the open FILE stream does change (after external editing) on the XFS file system, but not on the ext3 file system.

But I think the same principles you mentioned above still apply. So, does this mean I have to reopen the FILE stream every time I want to check whether the file has been modified? I guess that isn't too hard to do, but it feels a bit odd, since (in my case) I need to know whether the file has been modified before pretty much every operation the program performs. That is a lot of opening and closing of FILE streams.
 
Old 05-22-2012, 02:17 PM   #4
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 943
Quote:
Originally Posted by hydraMax View Post
Looking into my code some more, it seems that the problem wasn't that the file contents pointed to by the stream aren't changing (although that may still be a problem), but rather that my program only knows that it has changed when it detects that the mtime (from stat()) has changed.
Ah. Why not use the inotify interface to detect the IN_CLOSE_WRITE, IN_MOVE_SELF and IN_DELETE_SELF events for that file? The first requires a re-read; the others require a reopen (since someone has replaced or removed the file) followed by a re-read. Note that if the editor removes the original file first, instead of using the recommended mechanism of renaming the new file over the old one, there may be a short period when the file does not exist, so some kind of retry mechanism for the reopen might be needed.

There is also a small delay before inotify reports an event. The delay effectively guarantees that all processes will see the new state, so in your case it may be useful; but if changes occur at a high frequency, the delay could become a bottleneck.

Quote:
Originally Posted by hydraMax View Post
For some reason, the mtime of the open FILE stream does change (after external editing) under the xfs file system, but not under the ext3 file system.
I think that's related to the writeback wonkiness on ext3 I mentioned.

Quote:
Originally Posted by hydraMax View Post
But I think the same principles as you mentioned above still apply. So, does this mean I have to reopen the FILE stream every time I want to see if a file has been modified?
No, not really. I would fflush() before the re-read, though.

The window during which the changes are not yet visible to other processes is very short, because it mostly comes down to synchronization between CPU cores inside the kernel. The page cache is shared between all processes, so that synchronization happens very quickly.

Using any kind of synchronization to detect the change point -- inotify, advisory file locks, file leases -- works, because the change notification is delivered after the synchronization has occurred.

Using a pipe or a socket to tell the other process that the modifications are complete is racy, because there is no guarantee that the synchronization has occurred yet: the pipe or socket may be faster than the kernel's synchronization. But if you couple that with advisory file locks or leases, the lock or lease correctly handles the short race window, giving you trustworthy guarantees. The timescale involved here is very, very short; certainly less than a second.

Quote:
Originally Posted by hydraMax View Post
I need to know whether or not the file has been modified before pretty much every operation that is done in the program. So that is a lot of opening and closing of FILE streams.
On my machine, a repeated fopen() on the same file takes about 2 µs: a single process can reopen the same file about half a million times per second. The cost of fopen() seems negligible to me, considering you are working on a file modified by multiple unrelated processes.

There are a few things you might consider. For example, use a generation counter, perhaps at the start of the file. Use pread() to read it (without affecting the file position or confusing the standard I/O on the same file in any way) to see whether the file contents have changed. Processes modifying the file should update the counter only after the other modifications have been written to the file. The size of the counter is not that important; it is perfectly okay for it to wrap around, and if your file is text, you can certainly use an identifier string instead. (The number of unique states determines how many writes a reader can miss while still being guaranteed to notice that a new write happened.)
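A sketch of that probe; the layout (an 8-byte counter at offset 0 of the file) is just an assumption for illustration:

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read the 8-byte generation counter assumed to live at offset 0 of
 * the open stream's file. pread() takes its own offset, so the
 * stream's position and buffers are left untouched.
 * Returns 0 on read failure. */
static uint64_t read_generation(FILE *f)
{
    uint64_t gen;
    if (pread(fileno(f), &gen, sizeof gen, 0) != (ssize_t)sizeof gen)
        return 0;
    return gen;
}
```

A reader calls read_generation() before each operation and only re-reads the file when the value differs from the last one it saw.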

To avoid possible race conditions in the use of the generation counter, have all processes take an advisory lock when accessing the file. Readers can read the generation counter at any time (remember, the locks are advisory, not mandatory; they won't block any read or write operations). Whenever they see a change, they acquire a read lock (fcntl(fileno(handle), F_SETLKW, {F_RDLCK, SEEK_SET, 0, 0})) and keep it during the re-read, to make sure nobody modifies the file at the same time; after the re-read, they drop the lock. Writers simply take a write lock (fcntl(fileno(handle), F_SETLKW, {F_WRLCK, SEEK_SET, 0, 0})), modify the contents, preferably updating the generation counter last (to minimize the time readers have to wait for the read lock), then drop the lock. Note that both wait until they acquire the lock, and avoid any race conditions that way.
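The inline fcntl() forms above are shorthand; spelled out with the actual struct flock interface, a whole-file blocking lock looks like this (the helper name is mine):

```c
#include <fcntl.h>
#include <stdio.h>

/* Take (F_RDLCK or F_WRLCK) or release (F_UNLCK) a blocking,
 * whole-file advisory lock on an open stream.
 * Returns 0 on success, -1 on error. */
static int file_lock(FILE *f, short type)
{
    struct flock lk = {
        .l_type   = type,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,   /* length 0 means "to the end of the file" */
    };
    return fcntl(fileno(f), F_SETLKW, &lk);
}
```

A reader would call file_lock(f, F_RDLCK) before the re-read and file_lock(f, F_UNLCK) after; a writer uses F_WRLCK around its modifications.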

Using the generation counter and advisory lock scheme, you should get pretty much maximum throughput with minimal CPU time.

If you cannot control the editors, then inotify is your best bet. Or, if you are Linux-only, you could use file leases to detect when another process starts modifying the file, then repeatedly try to take a lease on the file -- it will not succeed as long as the file is open in any other process -- until successful, then re-read. Both advisory locks and file leases should be sufficient to guarantee synchronicity.

Care to describe what your program does in a little more detail? Is it just one, or multiple files? Are the files small or large? Binary or text? All processes forked from the same parent? Can you modify the sources for all processes reading and writing to the file?
 
Old 05-23-2012, 01:14 AM   #5
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Original Poster
Rep: Reputation: 51
https://frigidcode.com/code/csvfs/

My program translates a data file into a file system. If some external program changes the data file, that is okay with me, but after the data file changes, I need to know it happened so I can change the file system representation.
 
Old 05-25-2012, 12:00 AM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 1,612

Rep: Reputation: 674
What you are looking for is a "file alteration monitor". On many distributions (Gentoo included), the package you want is called "gamin". It's fairly easy to set up a socket that receives events whenever a monitored file or directory changes. If the package is installed on your system, `man fam` will show the details and usage.
 
  

