LinuxQuestions.org


andystanfordjason 12-19-2006 12:11 PM

fast file i/o in c
 
Hello, I am currently writing a time-critical C program and would like to know if there is a fast way of reading large text files. The text is millions of lines of sets of 3 integers representing a sparse matrix. Thanks very much.

jim mcnamara 12-19-2006 01:02 PM

Call stat() to get the file size, allocate a buffer, and call something like this:
Code:

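/* read nbyte bytes from a file descriptor  -  pass the file size to read the whole file */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

ssize_t readall(int fd, void *buf, size_t nbyte){
        size_t nread = 0;       /* unsigned, so it compares cleanly with nbyte */
        ssize_t n = 0;
        do
        {
                if ((n = read(fd, (char *)buf + nread, nbyte - nread)) == -1)
                {
                        if (errno == EINTR)
                                continue;       /* interrupted by a signal: retry */
                        else
                                return (-1);
                }
                if (n == 0)                     /* end of file before nbyte bytes */
                        return (ssize_t)nread;
                nread += (size_t)n;
        } while (nread < nbyte);
        return (ssize_t)nread;
}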
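/* malloc an array of pointers, then call this to point each entry at the
   start of a line. records needs room for (number of lines + 1) pointers;
   the entry after the last line is set to NULL.
   call with
      oldchar='\n'
      newchar=0x0
      assign =1
   if the last line has no trailing newline, allocate filebytes+1 bytes and
   set buf[filebytes]=0 so that line is terminated too */

char *alter_char(void *buf,
                const size_t filebytes,
                const int oldchar,
                const int newchar,
                char **records,
                const int assign)
{
        size_t i=0;
        size_t rcnt=1;
        char *p=(char *)buf;

        if(assign)      /* records is only touched when assign is set */
        {
                records[0]=(char *)buf;
                records[rcnt]=NULL;
        }
        for(i=0; i<filebytes; i++,p++)
        {
                if(*p==oldchar)
                {
                        *p=(char)newchar;
                        /* nothing follows a final newline, so no record for it */
                        if(assign && i+1<filebytes)
                        {
                                records[rcnt]=p+1;
                                records[++rcnt]=NULL;
                        }
                }
        }
        return (char *)buf;
}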

You can optimize or drop the second function if you need to -- it's basically taking the place of fgets().
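
A minimal sketch of how the steps above might fit together for the original question, assuming each line holds three whitespace-separated integers. The names `load_triples` and `struct triple` are hypothetical, not part of the posted code. It uses fstat() on the open descriptor rather than stat(), repeats the readall() retry loop inline so the example stands alone, and parses with strtol() straight from the buffer instead of splitting lines first.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record type: one (row, col, val) entry of the sparse matrix. */
struct triple { long row, col, val; };

/* Read the whole file into memory, then parse "row col val" triples.
   On success returns 0, stores a malloc'd array in *out (caller frees;
   NULL if the file held no triples) and the number of triples in *count.
   Returns -1 on error. */
int load_triples(const char *path, struct triple **out, size_t *count)
{
        struct stat st;
        struct triple *arr = NULL;
        char *buf, *p, *end;
        size_t size, nread = 0, cap = 0, n = 0;
        ssize_t r;
        int fd = open(path, O_RDONLY);

        if (fd == -1)
                return -1;
        if (fstat(fd, &st) == -1) { close(fd); return -1; }
        size = (size_t)st.st_size;
        if ((buf = malloc(size + 1)) == NULL) { close(fd); return -1; }

        /* Same idea as readall(): keep reading until EOF or the buffer is full. */
        while (nread < size) {
                r = read(fd, buf + nread, size - nread);
                if (r == -1) {
                        if (errno == EINTR)
                                continue;
                        free(buf); close(fd);
                        return -1;
                }
                if (r == 0)
                        break;
                nread += (size_t)r;
        }
        close(fd);
        buf[nread] = '\0';      /* strtol() needs a terminated string */

        for (p = buf;;) {
                long v[3];
                int k;
                for (k = 0; k < 3; k++) {
                        v[k] = strtol(p, &end, 10);     /* skips spaces and newlines */
                        if (end == p)                   /* no digits left: stop */
                                break;
                        p = end;
                }
                if (k < 3)
                        break;
                if (n == cap) {                         /* grow the array geometrically */
                        size_t ncap = cap ? cap * 2 : 1024;
                        struct triple *t = realloc(arr, ncap * sizeof *t);
                        if (t == NULL) { free(arr); free(buf); return -1; }
                        arr = t;
                        cap = ncap;
                }
                arr[n].row = v[0]; arr[n].col = v[1]; arr[n].val = v[2];
                n++;
        }
        free(buf);
        *out = arr;
        *count = n;
        return 0;
}
```

Parsing stops silently at the first token that is not a number, so real input would probably want stricter validation and overflow checks.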

andystanfordjason 12-19-2006 04:44 PM

Thanks, that's really good. If anyone else has any code, it would be much appreciated too. Cheers.

aluser 12-22-2006 09:01 PM

mmap() might give you a small gain.

If you can, you probably want to do some of your mathematical computations in one thread *while* another thread reads data from the file.
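
One way the reader/worker split could look, as a rough sketch with POSIX threads. Everything here is an assumption for illustration: `struct dbuf`, `reader_thread`, `count_lines_threaded` and the 64 KB `CHUNK` size are made-up names and values, and counting newlines stands in for the real math. Depending on the toolchain it may need to be built with `-pthread`.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 65536     /* assumed chunk size; tune for the real workload */

/* Hypothetical double buffer: the reader fills one slot while the
   consumer works on the other. */
struct dbuf {
        FILE *fp;
        char buf[2][CHUNK];
        size_t len[2];
        int full[2];    /* 1 while slot i holds unprocessed data */
        int done;       /* set by the reader at EOF (or on a read error) */
        pthread_mutex_t mu;
        pthread_cond_t cv;
};

static void *reader_thread(void *arg)
{
        struct dbuf *q = arg;
        int i = 0;
        size_t n;

        do {
                pthread_mutex_lock(&q->mu);
                while (q->full[i])      /* wait for the consumer to free slot i */
                        pthread_cond_wait(&q->cv, &q->mu);
                pthread_mutex_unlock(&q->mu);

                n = fread(q->buf[i], 1, CHUNK, q->fp);  /* I/O runs without the lock */

                pthread_mutex_lock(&q->mu);
                if (n == 0) {
                        q->done = 1;
                } else {
                        q->len[i] = n;
                        q->full[i] = 1;
                }
                pthread_cond_signal(&q->cv);
                pthread_mutex_unlock(&q->mu);
                i ^= 1;
        } while (n != 0);
        return NULL;
}

/* Counts newlines as a stand-in for the real computation.
   Returns the count, or -1 if the file or thread could not be set up. */
long count_lines_threaded(const char *path)
{
        struct dbuf *q = calloc(1, sizeof *q);
        pthread_t tid;
        long lines = 0;
        int i = 0;
        size_t k;

        if (q == NULL)
                return -1;
        if ((q->fp = fopen(path, "rb")) == NULL) {
                free(q);
                return -1;
        }
        pthread_mutex_init(&q->mu, NULL);
        pthread_cond_init(&q->cv, NULL);
        if (pthread_create(&tid, NULL, reader_thread, q) != 0) {
                fclose(q->fp);
                free(q);
                return -1;
        }
        for (;;) {
                pthread_mutex_lock(&q->mu);
                while (!q->full[i] && !q->done)
                        pthread_cond_wait(&q->cv, &q->mu);
                if (!q->full[i]) {      /* reader finished and this slot is empty */
                        pthread_mutex_unlock(&q->mu);
                        break;
                }
                pthread_mutex_unlock(&q->mu);

                for (k = 0; k < q->len[i]; k++) /* work while the reader refills */
                        if (q->buf[i][k] == '\n')
                                lines++;

                pthread_mutex_lock(&q->mu);
                q->full[i] = 0;
                pthread_cond_signal(&q->cv);
                pthread_mutex_unlock(&q->mu);
                i ^= 1;
        }
        pthread_join(tid, NULL);
        pthread_mutex_destroy(&q->mu);
        pthread_cond_destroy(&q->cv);
        fclose(q->fp);
        free(q);
        return lines;
}
```

A real parser would also have to carry a partial line over from one chunk to the next, since chunk boundaries will not line up with newlines.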

Of course, the best way to speed up disk I/O is to change your hardware: use faster disks, RAID with striping, or both.

But then, if the files aren't that large (only a couple million lines), they'll fit in RAM so you're mostly trying to minimize in-memory copies, not improve disk I/O. mmap() ought to help there.
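
A small sketch of what the mmap() route could look like, assuming a POSIX system. The helper `sum_ints_mmap` is hypothetical, and summing the integers is only a placeholder for filling the matrix. The point is that the numbers are parsed straight out of the mapped pages, with no copy into a separate buffer.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum every decimal integer in a text file through
   a read-only mapping. Returns 0 on success, -1 on error.
   No overflow checking is done. */
int sum_ints_mmap(const char *path, long *sum)
{
        struct stat st;
        const char *map, *p, *end;
        long total = 0;
        int fd = open(path, O_RDONLY);

        if (fd == -1)
                return -1;
        if (fstat(fd, &st) == -1) { close(fd); return -1; }
        if (st.st_size == 0) {          /* mmap() rejects a zero length */
                close(fd);
                *sum = 0;
                return 0;
        }
        map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                      /* the mapping stays valid after close */
        if (map == MAP_FAILED)
                return -1;

        /* The mapping is not NUL-terminated, so scan with explicit bounds
           instead of calling strtol(). */
        p = map;
        end = map + st.st_size;
        while (p < end) {
                long v = 0;
                int neg = 0;
                while (p < end && *p != '-' && (*p < '0' || *p > '9'))
                        p++;            /* skip separators */
                if (p < end && *p == '-') { neg = 1; p++; }
                if (p == end || *p < '0' || *p > '9')
                        continue;       /* lone '-' or end of data */
                while (p < end && *p >= '0' && *p <= '9')
                        v = v * 10 + (*p++ - '0');
                total += neg ? -v : v;
        }
        munmap((void *)map, (size_t)st.st_size);
        *sum = total;
        return 0;
}
```

Whether this beats a single large read() depends on the system and the access pattern, so it is worth measuring both.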


And the bottom line is, if you haven't profiled the code, do so before you choose places to optimize! Could be that your math takes 98% of the time and I/O is a moot point.
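
As a crude first check before reaching for a real profiler, the I/O and math phases can be timed separately. This is only a sketch: `elapsed_sec` is a hypothetical helper, and `load_input()` and `compute()` in the usage comment are placeholders for the program's own phases.

```c
#include <time.h>

/* Hypothetical helper: seconds between two clock_gettime() samples. */
double elapsed_sec(const struct timespec *start, const struct timespec *end)
{
        return (double)(end->tv_sec - start->tv_sec)
             + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Usage sketch (load_input and compute are placeholders):

        struct timespec t0, t1, t2;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        load_input();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        compute();
        clock_gettime(CLOCK_MONOTONIC, &t2);
        printf("io %.3f s, math %.3f s\n",
               elapsed_sec(&t0, &t1), elapsed_sec(&t1, &t2));
*/
```

If the math phase dominates, a function-level profiler will show where inside it the time goes.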

jim mcnamara 12-23-2006 09:45 AM

aluser is correct. Profile first, optimize later. Frequently, one function with a particularly poor algorithm can use 40% of the elapsed time of the whole process.


All times are GMT -5.