shared write on local file

Jerry Mcguire · 11-04-2009, 01:54 AM

Hi all,

Maybe you have answered this a thousand times, but please help anyway.

I am developing an application in which several programs on Linux share-write to one local file, i.e. multiple writers and perhaps some readers. I have two very basic questions about the file read/write operations:

1) When the writers and readers are working at high speed, how do I make sure the reader either reads a full record or reads nothing? Each record has a header that gives the number of bytes in the record. Is the following OK?

typedef struct rec_tag {
    int iLen; // entire rec length including this 'int'
    char buffer[65536]; // max 64KB, arbitrary
} rec_t;

rec_t rec;

/*writer.cc------------------------------------------*/
int fd = open( "mylist", O_RDWR|O_CREAT|O_APPEND|O_SYNC, 00664 );
...
while (1) {
    ...
    write( fd, &rec, sizeof(rec.iLen)+rec.iLen ); // write it in one go
}

/*reader.cc------------------------------------------*/
int fd = open( "mylist", O_RDONLY );
...
while (1) {
    int xx = read( fd, &rec.iLen, sizeof(rec.iLen) );
    if (xx == sizeof(rec.iLen)) {
        int yy = read( fd, rec.buffer, rec.iLen );
        ...
    }
}

Let's just assume sizeof(rec.iLen) is 4 for simplicity.
Would xx always be 0 (nothing read), 4 (good, okay), or -1 (error), and never something between 0 and 4?
Could yy ever be 0?

/////////////////////////
2) Once the above question is clarified: to guarantee that only one writer can write to the file at a time, is the following the right thing to do?

/*multiwriter.cc---------------------*/
int fd = open( "alerts", O_RDWR|O_CREAT|O_APPEND|O_SYNC, 00664 );
...
if (flock( fd, LOCK_EX ) == 0) {
    // do the write() thing as in the first question.
    flock( fd, LOCK_UN );
}
else {
    // look for another chance to write
}

/*multireader.cc---------------------*/
int fd = open( "alerts", O_RDONLY );
...
// just read() as in the first question.

Some error checking is skipped for illustration purposes. I need to know if I'm doing it right. Please let me know. Thanks very much.
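
A note on the second snippet: flock(fd, LOCK_EX) without LOCK_NB blocks until the lock can be acquired, so the else branch is only reached on an error. If the intent is to "look for another chance to write", a minimal sketch with LOCK_NB might look like the following. The helper name try_locked_append and its return codes are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to append len bytes under an exclusive flock().
 * Returns 1 if the data was written, 0 if another open file description
 * holds the lock (caller may retry later), -1 on error or short write. */
int try_locked_append(int fd, const void *data, size_t len)
{
    assert(data != NULL);

    /* LOCK_NB makes flock() fail with EWOULDBLOCK instead of waiting. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0)
        return (errno == EWOULDBLOCK) ? 0 : -1;

    ssize_t n = write(fd, data, len);
    int saved = errno;          /* keep write()'s errno across the unlock */
    flock(fd, LOCK_UN);
    errno = saved;

    return (n == (ssize_t)len) ? 1 : -1;
}
```

A return value of 0 corresponds to the else branch in the question: nothing was written and the caller can try again later.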

Hko · 11-04-2009, 02:58 AM

Quote:

Originally Posted by Jerry Mcguire

1) When the writers and readers are working at high speed, how do I make sure the reader either reads a full record or reads nothing? Each record has a header that gives the number of bytes in the record. Is the following OK?

typedef struct rec_tag {
    int iLen; // entire rec length including this 'int'
    char buffer[65536]; // max 64KB, arbitrary
} rec_t;

rec_t rec;

That is easier than you seem to think. Reading or writing an arbitrary block with the read() and write() syscalls is already an atomic operation.

So no locks are needed, not even for writing a block, as long as you make sure that:

You read() an entire struct at once.
You use O_APPEND on the open() call for adding new structs at the end of the file.
You don't use O_APPEND on the open() call if you want to overwrite structs that already exist in the file.

Then reading or writing the block (struct) either succeeds entirely, or fails entirely, returning an error (-1).

Note that it is possible for read() to "half-succeed" and return less than the block size requested. But in that case there was not a complete block at the end of the file. So the file was corrupt, which in your case is an error too.

I am not sure what you want to use iLen for. You said it is the struct length, but the struct length is always the same: char buffer[65536] is always 65536 bytes, and int iLen is always 4 bytes.

So I wonder why store iLen? If it is to indicate up to where the bytes in buffer[] contain valid data, then it makes sense to me, but in that case why have it include the 4 bytes of the int iLen itself?

If it is meant to be read first to know how many bytes to read next, it is wrong. Then read()-ing a struct is not atomic anymore (two read operations on one struct are by definition not atomic). Also it is not needed, since the structs are always the same size.

Jerry Mcguire · 11-04-2009, 03:05 AM

Quote:

Originally Posted by Hko

So I wonder why store iLen? If it is to indicate up to where the bytes in buffer[] contain valid data, then it makes sense to me, but in that case why have it include the 4 bytes of the int iLen itself?

If it is meant to be read first to know how many bytes to read next, it is wrong. Then read()-ing a struct is not atomic anymore (two read operations on one struct are by definition not atomic). Also it is not needed, since the structs are always the same size.

Thanks Hko. Indeed, iLen stores the length of the valid content that follows. So with variable-length records it is not possible to pass a fixed length to read(). What should be done to overcome this?

Hko · 11-04-2009, 03:44 AM

Quote:

Originally Posted by Jerry Mcguire

Thanks Hko. Indeed, iLen stores the length of the valid content that follows. So with variable-length records it is not possible to pass a fixed length to read(). What should be done to overcome this?

I suggest (and assumed when I wrote my previous post) reading and writing entire records/structs. Use iLen to indicate how many bytes of buffer are meaningful.

So even if you have, say, two meaningful bytes in buffer, read and write all 65536, but set iLen = 2.

This may waste some space on the filesystem, but it is the easiest approach. Otherwise you should use locking, or it will become a complex issue to make sure reads/writes are atomic.

The amount of wasted disk space can be mitigated by lseek()-ing forward over the unused bytes in the buffers, so the filesystem may be able to make the file(s) sparse (i.e. not really wasting all the space).

Also you should try to make buffer as small as possible, e.g. if you are never going to write more than 2000 bytes in buffer, just don't make it 65536 bytes big.
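
The fixed-size approach described above could be sketched roughly as follows. This is only an illustration: the 2000-byte buffer is the example size from the post, and the names fixed_rec_t, append_fixed and read_fixed are hypothetical. Note that write() can in principle return a short count (for example when the disk is full), so the sketch reports that as an error rather than assuming it cannot happen.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define REC_DATA_MAX 2000   /* assumed maximum payload, per the example above */

typedef struct {
    int  iLen;                  /* number of meaningful bytes in buffer */
    char buffer[REC_DATA_MAX];
} fixed_rec_t;

/* Append one whole record with a single write(). fd is assumed to have
 * been opened with O_APPEND. Returns 0 on success, -1 on error or on a
 * short write. */
int append_fixed(int fd, const fixed_rec_t *rec)
{
    assert(rec->iLen >= 0 && rec->iLen <= REC_DATA_MAX);
    ssize_t n = write(fd, rec, sizeof *rec);
    return (n == (ssize_t)sizeof *rec) ? 0 : -1;
}

/* Read one whole record with a single read(). Returns 1 for a full
 * record, 0 at end of file, -1 on error or when only part of a record
 * was available (the "half-succeeds" case). */
int read_fixed(int fd, fixed_rec_t *rec)
{
    ssize_t n = read(fd, rec, sizeof *rec);
    if (n == 0)
        return 0;
    return (n == (ssize_t)sizeof *rec) ? 1 : -1;
}
```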

Jerry Mcguire · 11-04-2009, 04:28 AM

mmm... Please comment on whether my logic makes sense:

If the file layout contains fixed length records, simple read() and write() operations with the record should suffice, because write() is atomic.

If the file layout contains variable length content, notated as
{ length of content following, actual content }
in my case, then the reader is left with only 2 choices to process the file:

A) as mentioned earlier, do 2 read()'s: one for the length, one for the actual content.

B) do a read() with the maximum possible length and process whatever was read until an incomplete record is left hanging, then repeat.

I think A) is better in many ways because it is by all means simpler, and because it can pick up where it left off by saving the lseek(fd,0,SEEK_CUR) offset after processing each record. (can't afford to repeat or skip data if anything dies except me).

??
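
For illustration, option A with a saved offset might look something like this. It is a sketch under assumptions: the header is a native int holding the content length, and the name read_varlen and its return codes are invented here. If the header or the content is not fully there yet, the file offset is put back so the next call retries the same record.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Option A sketch: read one { length, content } record with two read()s.
 * Returns 1 and sets *out_len when a complete record was read,
 * 0 when no complete record is available yet (offset is restored),
 * -1 on error or when the stored length is negative or exceeds cap. */
int read_varlen(int fd, char *buf, int cap, int *out_len)
{
    assert(buf != NULL && out_len != NULL);

    off_t start = lseek(fd, 0, SEEK_CUR);   /* resume point */
    if (start == (off_t)-1)
        return -1;

    int len = 0;
    ssize_t n = read(fd, &len, sizeof len);
    if (n < 0)
        return -1;
    if (n == (ssize_t)sizeof len) {
        if (len < 0 || len > cap)
            return -1;                      /* corrupt or oversized record */
        n = read(fd, buf, (size_t)len);
        if (n < 0)
            return -1;
        if (n == (ssize_t)len) {
            *out_len = len;
            return 1;
        }
    }

    /* Header or content incomplete: rewind so the next call retries. */
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return -1;
    return 0;
}
```

The offset returned by lseek(fd, 0, SEEK_CUR) after a successful call is the value a reader would persist in order to resume after a restart.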

Hko · 11-04-2009, 06:02 AM

Quote:

Originally Posted by Jerry Mcguire

[...]
in my case, then the reader is left with only 2 choices to process the file:

A) as mentioned earlier, do 2 read()'s: one for the length, one for the actual content.

...and hold a lock while doing two read()s...
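
One way to picture "hold a lock while doing two read()s" is a shared flock() on the reader side, assuming every writer takes LOCK_EX around its write() as in the second question. The function name locked_read_record is hypothetical, and the header is assumed to be a native int holding the content length.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Read header and content while holding a shared lock, so a writer that
 * uses LOCK_EX cannot be in the middle of a record between the two reads.
 * Returns 1 and sets *out_len for a full record, 0 at end of file,
 * -1 on error or on an incomplete or oversized record. */
int locked_read_record(int fd, char *buf, int cap, int *out_len)
{
    assert(buf != NULL && out_len != NULL);

    if (flock(fd, LOCK_SH) != 0)    /* waits while a writer holds LOCK_EX */
        return -1;

    int rc = -1;
    int len = 0;
    ssize_t n = read(fd, &len, sizeof len);
    if (n == 0) {
        rc = 0;                     /* clean end of file */
    } else if (n == (ssize_t)sizeof len && len >= 0 && len <= cap) {
        if (read(fd, buf, (size_t)len) == (ssize_t)len) {
            *out_len = len;
            rc = 1;
        }
    }

    flock(fd, LOCK_UN);
    return rc;
}
```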

Quote:

Originally Posted by Jerry Mcguire

B) do a read() with the maximum possible length and process whatever was read until an incomplete record is left hanging, then repeat.

Depending on what the data in buffer actually represents (i.e. is the meaningful chunk of data never more than the buffer size?), IMHO the best option is the first one I mentioned: forget about iLen, and write/read blocks of 65536 bytes at once (or whatever buffer size you choose, as long as it is the same all the time).

Jerry Mcguire · 11-04-2009, 07:41 PM

Sorry to keep bothering you; please don't get mad at my nagging question marks.
I don't get it, if the writers can only append and never modify any written record, isn't it quite safe for the readers to traverse the file without locks?

If write() is atomic, then a record is always complete in the file as long as write() writes the length and content in one call.

So, if the reader is able to read the length portion, it is guaranteed to read the content portion next, would it not?

ta0kira · 11-04-2009, 08:15 PM

Quote:

Originally Posted by Jerry Mcguire

char buffer[65536]; //max 64KB, arbitrary

Just a side note: you should malloc this because it's quite a bit to put on the stack. Also, if you use stat/fstat, it will tell you the best size increment to read the file in, which is based on the block size of the underlying file system, if you're concerned with efficiency.
Kevin Barry
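
As a rough sketch of that side note, the preferred I/O size is reported in the st_blksize field of struct stat, and the buffer can be taken from the heap. The helper name alloc_io_buffer is invented for this example; it uses stat() on a path, and fstat() on an open descriptor would work the same way.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Allocate a heap buffer sized to the preferred I/O block size of the
 * file system that holds path. On success returns the buffer and stores
 * its size in *out_size; the caller must free() it. Returns NULL (and
 * leaves *out_size untouched) if stat() or malloc() fails. */
char *alloc_io_buffer(const char *path, size_t *out_size)
{
    assert(path != NULL && out_size != NULL);

    struct stat st;
    if (stat(path, &st) != 0 || st.st_blksize <= 0)
        return NULL;

    char *buf = malloc((size_t)st.st_blksize);
    if (buf != NULL)
        *out_size = (size_t)st.st_blksize;
    return buf;
}
```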

Hko · 11-14-2009, 06:59 AM

Quote:

Originally Posted by Jerry Mcguire

I don't get it, if the writers can only append and never modify any written record, isn't it quite safe for the readers to traverse the file without locks?

Yes. It is even safe if the writers do change a record that already exists, if writing happens atomically.

Quote:

Originally Posted by Jerry Mcguire

If write() is atomic, then a record is always complete in the file as long as write() writes the length and content in one call.

Yes.

Quote:

Originally Posted by Jerry Mcguire

So, if the reader is able to read the length portion, it is guaranteed to read the content portion next, would it not?

Yes, provided that (again) writing occurs atomically.
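
To make "write() writes the length and content in one call" concrete, a writer could assemble the header and the content in one contiguous buffer first. The following is a minimal sketch, assuming a native int header that holds the content length and a descriptor opened with O_APPEND; append_varlen is a made-up name. A short write would leave a partial record in the file, which this sketch only reports and does not repair.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append one { int length, content } record with a single write().
 * fd is assumed to be opened with O_APPEND.
 * Returns 0 on success, -1 on allocation failure, error, or short write. */
int append_varlen(int fd, const void *data, int len)
{
    assert(data != NULL && len >= 0);

    size_t total = sizeof len + (size_t)len;
    char *rec = malloc(total);
    if (rec == NULL)
        return -1;

    memcpy(rec, &len, sizeof len);               /* header: content length */
    memcpy(rec + sizeof len, data, (size_t)len); /* content right after it */

    ssize_t n = write(fd, rec, total);           /* one call for the record */
    free(rec);
    return (n == (ssize_t)total) ? 0 : -1;
}
```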