LinuxQuestions.org


nodger 11-24-2004 08:18 AM

unix file descriptors versus c FILE pointers
 
I've read that the C file access functions (fopen, fseek, etc.) use some buffering, whereas the standard unix file descriptors do not. I'm thinking of converting some programs that use the standard C library file pointers to use the descriptors instead. I've encountered bugs and inconsistencies (when porting to FreeBSD) that I hope will be eliminated, but will it be faster?

jlliagre 11-24-2004 08:47 AM

Probably not, stdio buffering is there to increase I/O performance, not the opposite.

itsme86 11-24-2004 09:26 AM

If you use the low-level calls correctly they will be faster. The stdio functions call their low-level counterparts. For instance, fopen() will call open(), fread()/fgets()/etc. will call read(), and so on.

By working directly with the low level functions you eliminate the stdio overhead of maintaining all the stdio struct information and such.

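There is buffering underneath the low level functions too, though. For instance, doing a read() on STDIN_FILENO when it is a terminal will still not return until the user hits ENTER, because the terminal driver buffers the line in its default (canonical) mode. And then the stdio routines have their own buffer separate from that one.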

itsme86 11-24-2004 09:29 AM

Just to prove I'm not crazy ;)
Code:

#include <stdio.h>

int main(void)
{
  FILE *fp;

  fp = fopen("slappy.foo", "w");
  fputs("You're a slappy foo\n", fp);
  fclose(fp);

  return 0;
}

And when you strace it:
Code:

itsme@dreams:~/C$ strace ./stdio
.
.
.
open("slappy.foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fstat64(0x3, 0xbffff7bc)                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
write(3, "You\'re a slappy foo\n", 20)  = 20
close(3)                                = 0
.
.
.

You can see the stdio functions definitely call the low level functions.

jlliagre 11-24-2004 12:16 PM

I never said the stdio library functions don't call the underlying system calls; that is obviously mandatory for any I/O to occur.
I was only pointing out that these libraries maintain a buffer that can reduce the number of system calls.
Here's something demonstrating it:
Code:

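#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>   /* system() */
#include <unistd.h>   /* write(), close() */

int main(void)
{
  int fd=open("/var/tmp/f1", O_CREAT | O_RDWR, 0777);
  FILE *fp=fopen("/var/tmp/f2", "w");
  static unsigned char buf[1024];
  int i;
  system("date");
  /* eight unbuffered 1 KB writes: one write() system call each */
  for(i=0; i<8; i++)
  {
    write(fd, buf, 1024);
  }
  close(fd);
  system("date");
  /* eight 1 KB fwrite() calls: stdio collects them in its own buffer */
  for(i=0; i<8; i++)
  {
    fwrite(buf, 1, 1024, fp);
  }
  fclose(fp);
  system("date");
  return 0;
}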

Code:

$ truss -t write b
Wednesday November 24 19:15:21 CET 2004
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 1024)    = 1024
Wednesday November 24 19:15:21 CET 2004
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)    = 8192
Wednesday November 24 19:15:21 CET 2004

It is also true that many layers of buffering and caching sit between a program and the final I/O destination, whatever the target device.
Depending on what kind of I/O nodger's program is doing, it may or may not be beneficial to replace stdio with system calls, and perhaps to go further by using memory mapped / direct I/O.

itsme86 11-24-2004 12:46 PM

So I guess all we proved is "it depends on the situation". Operations on "large" buffers (bigger than stdio's buffer size) are probably faster using low level I/O, whereas operations on smaller buffers can take advantage of the stdio functions.
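
To make the buffer-size point concrete, here is a minimal sketch, assuming a standard C99 libc: setvbuf() lets a program choose the stdio buffer size itself, so small chunked writes can still be batched into fewer write() calls. The helper name write_chunks_buffered and its parameters are made up for illustration and do not come from anyone's program in this thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes `count` chunks of `chunk` zero bytes to `path`
 * through stdio, using a fully buffered stream with a `bufsize`-byte buffer.
 * Returns the total number of bytes written, or -1 on error. */
long write_chunks_buffered(const char *path, size_t chunk, int count, size_t bufsize)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    char *iobuf = malloc(bufsize);   /* buffer handed over to stdio */
    char *data = calloc(1, chunk);   /* payload: `chunk` zero bytes */
    long total = -1;

    /* setvbuf() must be called before any other operation on the stream. */
    if (iobuf != NULL && data != NULL
        && setvbuf(fp, iobuf, _IOFBF, bufsize) == 0) {
        total = 0;
        for (int i = 0; i < count; i++) {
            if (fwrite(data, 1, chunk, fp) != chunk) {
                total = -1;
                break;
            }
            total += (long)chunk;
        }
    }

    /* fclose() flushes whatever is still buffered; only then free the buffer. */
    if (fclose(fp) != 0)
        total = -1;
    free(iobuf);
    free(data);
    return total;
}
```

Calling it with eight 1024-byte chunks and a 64 KB buffer and tracing it with strace or truss should typically show far fewer write() calls than eight, though the exact flushing behaviour is up to the libc.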

nodger 11-24-2004 01:12 PM

hmm...well I'm gonna go ahead with it regardless, because there are so many bugs and inconsistencies [in the stdio library] it ain't funny. I'm just assuming the lower level functions behave exactly the same on different unices

jlliagre 11-24-2004 02:37 PM

Quote:

there are so many bugs and inconsistencies [in the stdio library] it ain't funny
Inconsistencies, probably, but this is legacy and fixing them would break a majority of existing C code.
As for bugs, I would be interested to know which ones you have discovered.
I assume you are looking at the GNU C library implementation, which is neither poorly tested nor some newbie's code ...

Quote:

I'm just assuming the lower level functions behave exactly the same on different unices
This is probably true on the surface, but don't forget the standard library was designed precisely to hide system implementation discrepancies and features behind a common, consistent interface.
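
One example of what stdio hides: a raw write() is allowed to write fewer bytes than requested, or to fail with EINTR when a signal arrives, so code ported from fwrite() usually needs a retry loop. The following is only a sketch under that assumption; the name write_all is hypothetical, not part of any library.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes are out.
 * Returns 0 on success, -1 on error (errno is left as write() set it). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;           /* real error */
        }
        p += n;                  /* short write: advance and go again */
        len -= (size_t)n;
    }
    return 0;
}
```

With something like this in place, each write(fd, buf, n) in the converted programs would become write_all(fd, buf, n) followed by a check of the return value.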

nodger 11-25-2004 06:36 AM

well, under redhat9 I discovered a bug where, if you have 2 files open at the same time, under certain circumstances fread or fwrite fails (I can't remember which)

The "inconsistency" I discovered is that under FreeBSD, if you fseek() to a point before the start of the file and then try to write some data, the data will not be written, whereas under Linux it will.

jlliagre 11-25-2004 07:02 AM

Quote:

under redhat9 I discovered a bug where, if you have 2 files open at the same time, under certain circumstances fread or fwrite fails (I can't remember which)
You'll agree that's too vague to qualify as a bug yet ...

Quote:

The "inconsistency" I discovered is that under FreeBSD, if you fseek() to a point before the start of the file and then try to write some data, the data will not be written, whereas under Linux it will.
On both systems, the fseek should have returned -1 and set errno to EINVAL; if not, then IMHO it's a libc bug.

If they behave as I guess, your library should handle this error situation and not do any I/O after the failed fseek until the current file position has been fixed by a new fseek. If it reads or writes anyway, it is more your library's bug than a stdio inconsistency.
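
The check described above could look roughly like the following sketch. The helper name write_at is made up for illustration; it assumes only that fseek() returns non-zero when it rejects the offset.

```c
#include <stdio.h>

/* Hypothetical helper: seek to an absolute offset and write only if the
 * seek succeeded. Returns 0 on success, -1 if fseek() or fwrite() failed. */
int write_at(FILE *fp, long offset, const void *data, size_t len)
{
    if (fseek(fp, offset, SEEK_SET) != 0) {
        /* Seek rejected (for example a negative offset): do no I/O here.
         * errno is left as fseek() set it so the caller can inspect it. */
        return -1;
    }
    if (fwrite(data, 1, len, fp) != len)
        return -1;
    return 0;
}
```

A caller that gets -1 back would then fix the position with a new, valid fseek() before attempting any further reads or writes on that stream.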


All times are GMT -5.