LinuxQuestions.org

stf92 09-08-2013 05:41 AM

C: Reading past EOF.
 
Hi: I open a file this way:
Code:


        infile = fopen(argv[1], "rb");
        if(!infile){
                fprintf(stderr, "Cannot open %s\n", argv[1]);
                return 0;
        }
       
        // fileno gets the file descriptor out of stream infile
        fstat(fileno(infile), &filebuf);
        filesize= filebuf.st_size;                // get the file size
        printf("File size is %ld decimal.\n", (long)filesize);
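
The fstat() call above is not checked for failure. A minimal sketch of a checked version follows; the helper name stream_size is hypothetical and not part of the original program, and it assumes a POSIX system where fileno() and fstat() are available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes fileno() under strict -std= modes */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (not from the original post): returns the size in
   bytes of the file behind an open stream, or -1 if fstat() fails.
   Assumption: the size fits in a long. */
long stream_size(FILE *stream)
{
    struct stat info;

    if (fstat(fileno(stream), &info) != 0)
        return -1;
    return (long)info.st_size;
}
```

It could be used as `long filesize = stream_size(infile);`, with a negative result treated as an error before any reading starts.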

Then I begin reading the following way:
Code:

unsigned char chk;

for(buf=0; buf<16; buf++){
        address= 0;
        for(page=0; page<NO_OF_PAGES; page++){

                // Load flash page buffer
                for(ind=0; ind<PAGE_SIZE; ind++){
                        txmit(address, 0x0C);
                        fread(&chk, sizeof(chk), 1, infile);
                        txmit(_chk[0], 0x2C);      // transmit LSB [1]
                        txmit(_chk[1], 0x3C);      // transmit MSB  [0]

                        //printf("%04x %02x %02x\n", chk, _chk[1], _chk[0]);
                        txmit(0x00, 0x7D);
                        txmit(0x00, 0x7C);
                        //code++;
                        address++;
                }

                // Load flash high address and program page
                if(page==0){
                        txmit(buf, 0x1C);
                        txmit(0x00, 0x64);
                        txmit(0x00, 0x6C);
                        poll();
                }else{
                        txmit(0x00, 0x64);
                        txmit(0x00, 0x6C);
                        poll();
                }

                if(address >= filesize) goto done1;
        }
}
done1:

As you can see, only by chance I stop reading at EOF. Put another way: suppose filesize is 1000 words (word = sizeof(unsigned short)). And I read 1013 words, which means I read past EOF. Next time I open the file and again I read the first 1013 words. Do I get the same 1013 words? The answer to this question is of interest to me for the diagnosis of the program, as I get correct results sometimes, and sometimes not. Before modifying the program so it reads exactly filesize words, I would like to know what the program, right now, is doing.
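
One way to avoid reading past EOF in the first place is to let fread() report how many words really came from the file. The sketch below is only an illustration: read_page is a hypothetical helper, it assumes unsigned short is 16 bits wide, and padding the tail with 0xFFFF is an assumption about the erased state of the flash device, not something stated in this thread.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to page_words words into page[] and returns
   how many were actually read from the file. Slots that could not be filled
   are set to 0xFFFF (assumed erased-flash value) so they never hold
   leftover data. */
size_t read_page(FILE *in, unsigned short *page, size_t page_words)
{
    size_t got = fread(page, sizeof page[0], page_words, in);
    size_t i;

    for (i = got; i < page_words; i++)
        page[i] = 0xFFFF;
    return got;
}
```

A return value smaller than page_words means the file ended (or a read error occurred) inside this page, so the programming loop can stop after transmitting it.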

dwhitney67 09-08-2013 06:09 AM

You should always check the return value of fread() to determine if an error (possibly indicating EOF) was detected.

Every time a file is opened, the file pointer is referencing the beginning of the file. You can move the file pointer, without the need of reading the file, using fseek(). You will of course have to know ahead of time the byte offset into the file that you want to go to.
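
As a rough sketch of the fseek() approach, the hypothetical helper below jumps to a known byte offset and reads a single byte there. The name read_byte_at is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: moves the file position to byte `offset`, counted
   from the start of the file, and reads one byte there.
   Returns 1 on success, 0 if the seek or the read failed. */
int read_byte_at(FILE *in, long offset, unsigned char *out)
{
    if (fseek(in, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, 1, 1, in) == 1;
}
```

Seeking to an offset at or beyond the end of the file usually succeeds; it is the following fread() that reports the problem by returning 0 items.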

In a previous project I worked on, I stored the next read position within 8 bytes at the beginning of the file. Thus I read the position first, then performed an fseek() afterwards.
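
The exact layout of that 8-byte header is not given here, so the following is only a sketch under assumptions: the position is stored as a uint64_t in the host's byte order at offset 0, and both function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: bytes 0..7 of the file hold the next read position as a
   uint64_t in host byte order. The file must be open for reading and
   writing. Both helpers return 1 on success and 0 on failure. */
int save_position(FILE *f, uint64_t pos)
{
    if (fseek(f, 0L, SEEK_SET) != 0)
        return 0;
    if (fwrite(&pos, sizeof pos, 1, f) != 1)
        return 0;
    return fflush(f) == 0;
}

int seek_to_saved_position(FILE *f)
{
    uint64_t pos;

    if (fseek(f, 0L, SEEK_SET) != 0)
        return 0;
    if (fread(&pos, sizeof pos, 1, f) != 1)
        return 0;
    /* Assumption: the stored position fits in a long. */
    return fseek(f, (long)pos, SEEK_SET) == 0;
}
```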

P.S. I am confused by the declaration of an unsigned char (chk), and then the usage of _chk (type unknown), which you access as if it were an array. But suffice it to say, it appears that you are reading one character at a time, not 16 bits (an unsigned short).

stf92 09-08-2013 06:27 AM

Yes, I am. And thanks for your concepts. Now, to rid the program of these mistakes is what I intend to do. But first, I would like to know what the effect is of reading the file's internal buffer past the end of the file. Could you tell me?

EDIT: having a problem with X. I'll resume later.

johnsfine 09-08-2013 06:37 AM

The standard for fread explicitly does not tell you what will be in your buffer after an fread that fails due to EOF (or for any other reason).

stf92 09-08-2013 06:49 AM

I think this: to simplify, assume logical sectors on the drive are 8192 bytes long. When I read for the first time (in the program above) all 8192 bytes will be in some buffer. If EOF is at offset 7999 in this buffer, there will be 192 spurious chars in the buffer. If I read them now, they will be the same as if I close the file, open it again and repeat same operation, because those bytes come from the fixed disk sector, where they can't change.

dwhitney67 09-08-2013 06:52 AM

Quote:

Originally Posted by stf92 (Post 5024020)
I think this: to simplify, assume logical sectors on the drive are 8192 bytes long. When I read for the first time (in the program above) all 8192 bytes will be in some buffer. If EOF is at offset 7999 in this buffer, there will be 192 spurious chars in the buffer. If I read them now, they will be the same as if I close the file, open it again and repeat same operation, because those bytes come from the fixed disk sector, where they can't change.

The EOF will not be stored in the buffer. fread() reads up to the EOF, but does not include the EOF in your buffer. fread() returns the number of items successfully read (e.g. 7999 when the item size is 1), which is smaller than the number requested if EOF or an error was encountered. It never returns -1, because its return type is the unsigned size_t. As the developer, if you receive a short count from fread(), it is then your responsibility to call feof() and ferror() on the stream, and to examine errno where applicable, to discern the exact cause.
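
To make the return-value behaviour concrete, here is a small sketch that asks for a full block and classifies the outcome. The enum and the function name read_block are hypothetical; only fread() and ferror() are standard.

```c
#include <stdio.h>

enum read_status { READ_FULL, READ_EOF, READ_ERROR };

/* Hypothetical helper: tries to read `want` bytes into buf and stores the
   number actually read in *got. A short count is classified with ferror();
   a short count without a stream error means end of file was reached. */
enum read_status read_block(FILE *in, unsigned char *buf,
                            size_t want, size_t *got)
{
    *got = fread(buf, 1, want, in);
    if (*got == want)
        return READ_FULL;
    return ferror(in) ? READ_ERROR : READ_EOF;
}
```

With a 7999-byte file and an 8192-byte request, this would report READ_EOF with *got set to 7999, and only the first 7999 bytes of buf would hold file data.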

stf92 09-08-2013 07:12 AM

The program is working, though not entirely as I'd like. As it writes my flash device, I must be careful with it. Therefore, before changing the program, I must fully understand its workings. OK.

Could we be a bit more ... to the hardware view of the thing. Operations involving hardware most times read entire blocks. So be sure an fread execution accesses the disk or controller buffer only if it needs a new cluster (block) or several of them. In the buffer nearer the user, there will also be 8192 chars. The only difference is that the user (program) will notice an error condition if it attempts to read past EOF. I think it's as simple as that.

johnsfine 09-08-2013 07:17 AM

Quote:

Originally Posted by stf92 (Post 5024020)
If I read them now, they will be the same as if I close the file, open it again and repeat same operation,

That is probably true, but foolish to rely on.

Quote:

because those bytes come from the fixed disk sector,
That is very unlikely. Regardless of how files are allocated on disk, actual reading from the file should stop at the logical end of file.

I would expect the user buffer to be unmodified by the failing fread. Since you used the same buffer for a successful fread earlier, I would expect the contents from the last successful fread to remain in the buffer through the subsequent unsuccessful freads. But I would not rely on that, because the standard for fread says you cannot rely on the user buffer contents from an unsuccessful fread.
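
Since the buffer contents after a failed fread() cannot be relied on, one defensive pattern is to read into a scratch variable and copy it out only on success. This is a sketch, not the original program's approach, and read_byte_keep_old is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: reads one byte into a local scratch variable and
   copies it to *dest only if the read succeeded. On EOF or error *dest is
   guaranteed to keep its previous value, whatever fread() did to scratch.
   Returns 1 on success, 0 otherwise. */
int read_byte_keep_old(FILE *in, unsigned char *dest)
{
    unsigned char scratch;

    if (fread(&scratch, sizeof scratch, 1, in) != 1)
        return 0;           /* *dest deliberately left untouched */
    *dest = scratch;
    return 1;
}
```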

Quote:

Originally Posted by stf92 (Post 5024027)
So be sure an fread execution accesses the disk or controller buffer only if it needs a new cluster (block) or several of them. In the buffer nearer the user, there will also be 8192 chars.

I'm not sure what you mean by "controller buffer" or "buffer nearer the user". But the point you seem to be trying to make is not correct.

The Linux kernel has buffers in its own memory space, that (among other uses) are used in situations where the end of a file does not land on a sufficiently aligned boundary. A physical read (from disk) into that kernel buffer will extend past the end of file, but the user requested read will be a memory to memory copy from the kernel buffer into the user buffer and that read will stop at the logical end of file.

fread itself is a function executing in user mode. It has no ability to access the kernel buffer. It can make a request to the kernel to copy directly from the file (kernel buffer) into the final user buffer. Alternately, it can request the kernel to copy from the file to fread's own buffer, then fread can copy from its own buffer to the final user buffer. Either way, the kernel operation copying from the kernel buffer will stop at the end of file.
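
The kernel side of this can be observed directly with the POSIX read() call, which returns fewer bytes than requested when the data runs out and 0 once nothing is left. The sketch below assumes a POSIX system; read_until_eof is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until the buffer is full or the
   kernel reports end of file (a return value of 0). Returns the number of
   bytes copied into buf, or -1 on a read error. */
ssize_t read_until_eof(int fd, unsigned char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == 0)
            break;                  /* end of file: nothing more to copy */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

Called on a descriptor for a 10-byte file with an 8192-byte buffer, it would be expected to return 10, however large the underlying disk block is.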

stf92 09-08-2013 07:23 AM

Your answer's been very to the point, I mean helpful, though it leaves me in indetermination. I had already made up my mind to rewrite the program to see if the indetermination in it goes away. But I have several sources of indetermination. To be honest, I now have one less, as I know THERE CAN BE indetermination. Thanks a lot.

