LinuxQuestions.org (/questions/)
-   Programming (http://www.linuxquestions.org/questions/programming-9/)
-   -   infinite loop on fclose()? (http://www.linuxquestions.org/questions/programming-9/infinite-loop-on-fclose-793478/)

joe2748 03-05-2010 02:43 PM

infinite loop on fclose()?
 
I'm writing some scientific code and need to process moderately large binary files (>10GB). While testing a small version on my laptop, I am noticing file-handling bugs that I hadn't seen before. Here is an example program that demonstrates my problem.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char** argv){
    off_t o;
    o = (off_t)pow(2, 33);
    /* %zu is the correct specifier for sizeof's size_t result */
    printf("The size of an off_t is %zu B\n", sizeof(off_t));
    FILE* f = fopen64("fuz", "wb");
    if (f == NULL){
        fprintf(stderr, "Error opening file\n");
        exit(1);
    }
    char data = 'd';
    if (fseeko64(f, o, SEEK_SET) != 0){
        fprintf(stderr, "Error seeking\n");
        exit(1);
    }
    if (fwrite(&data, sizeof(char), 1, f) != 1){
        fprintf(stderr, "Error writing\n");
        exit(1);
    }
    /* cast so the argument type matches %lli */
    printf("Expected file size is %lli\n", (long long)(o + 1));
    fclose(f);
    return 0;
}
If I run the code with o=pow(2,31), I get the following output:

Quote:

joe@joe-laptop:~/research/DG/ccode/v1.3.1$ ./t
The size of an off_t is 8 B
Expected file size is 2147483649
joe@joe-laptop:~/research/DG/ccode/v1.3.1$
This is what I expect. However, if I set o=pow(2,33), then I get the following output:
Quote:

joe@joe-laptop:~/research/DG/ccode/v1.3.1$ ./t
The size of an off_t is 8 B
Expected file size is 8589934593
^C^C^C^C^Z
The process does not finish, one core stays at 100% use, and as far as I can tell there is no significant hard drive activity. The ^C is me attempting to kill the program after 10 minutes. Note that kill -9 also does not kill the program; I have to restart my computer to get rid of it.

I am running Ubuntu Karmic with kernel:
Linux joe-laptop 2.6.31-19-generic #56-Ubuntu SMP Thu Jan 28 01:26:53 UTC 2010 i686 GNU/Linux

and an ext4 filesystem.

To compile the code I use:
Quote:

gcc t.c -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -g -o t
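As an aside, on glibc the -D_FILE_OFFSET_BITS=64 define already makes off_t 64-bit and transparently maps fopen/fseeko to their 64-bit variants, so the explicit *64 names and _LARGEFILE64_SOURCE shouldn't be needed. A minimal sketch of the same test using only the portable names (the file name "fuz2" is made up):

Code:

#include <stdio.h>

/* Build: gcc t2.c -D_FILE_OFFSET_BITS=64 -o t2 */
int main(void){
    off_t o = (off_t)1 << 33;   /* 2^33 = 8 GiB, computed without pow() */
    FILE *f = fopen("fuz2", "wb");
    char data = 'd';
    if (f == NULL || fseeko(f, o, SEEK_SET) != 0){
        perror("fopen/fseeko");
        return 1;
    }
    if (fwrite(&data, 1, 1, f) != 1){
        perror("fwrite");
        return 1;
    }
    printf("Expected file size is %lld\n", (long long)o + 1);
    return fclose(f) == 0 ? 0 : 1;
}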
Does anyone have any idea what the problem is?

BTW: I tried this exact code on a friend's laptop, which has a 2.6.25-something kernel and an ext3 filesystem, and it worked fine.

Hope someone can help!

nadroj 03-05-2010 03:58 PM

I tried your code after putting this after the "#include"s and before "main":
Code:

#define off_t long
#define fseeko64 fseek
#define fopen64 fopen

and it works fine. It may just be taking a while to write to the disk. Add an "fflush" before the "fclose", along with some "fprintf" statements to "stderr", to verify where it is getting held up. My guess is that it will take a while at the "fflush" statement (in which case it has nothing to do with "fclose" itself, but with actually writing the data to disk). Also, as a benchmark, try manually copying (i.e. in the GUI of your OS) a large file of 8GB or similar size.

neonsignal 03-05-2010 04:29 PM

Quote:

Originally Posted by nadroj (Post 3887492)
It may be possible that it is just taking a while to write to the disk.

The write doesn't take long, because only the last block of the file is physically written (the rest of the file is a hole the filesystem leaves unallocated; the file is sparse).

I wasn't able to duplicate the problem (on a 2.6.30 kernel and ext3 filesystem); the result was as expected.
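A quick way to confirm the sparseness (a minimal sketch; it assumes the test file is named "fuz" as in your program) is to compare the logical size with the allocated blocks reported by stat():

Code:

#include <stdio.h>
#include <sys/stat.h>

/* For a sparse file, allocated bytes (st_blocks * 512) are far
   smaller than the logical size (st_size). Build with
   -D_FILE_OFFSET_BITS=64 so st_size is 64-bit on i686. */
int main(int argc, char **argv){
    struct stat sb;
    const char *path = (argc > 1) ? argv[1] : "fuz";
    if (stat(path, &sb) != 0){
        perror("stat");
        return 1;
    }
    printf("logical size:    %lld B\n", (long long)sb.st_size);
    printf("allocated bytes: %lld B\n", (long long)sb.st_blocks * 512);
    return 0;
}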

nadroj 03-05-2010 04:36 PM

Correct; it makes sense that the 8GB is only nominal and that just the last page/block is physically written. I do not know the solution, so I was offering a "try this".
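As a side note (a hedged sketch, not something from this thread): if you actually wanted the blocks reserved up front rather than left as a hole, POSIX provides posix_fallocate(); the file name "reserved.bin" and the 1 GiB size here are made up:

Code:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void){
    int fd, err;
    /* Reserve real disk blocks up front; a seek-past-EOF write
       only creates a sparse hole instead. */
    fd = open("reserved.bin", O_WRONLY | O_CREAT, 0644);
    if (fd < 0){
        perror("open");
        return 1;
    }
    /* posix_fallocate returns 0 on success or an errno value on failure */
    err = posix_fallocate(fd, 0, (off_t)1 << 30);  /* 1 GiB */
    if (err != 0){
        fprintf(stderr, "posix_fallocate failed: %d\n", err);
        close(fd);
        return 1;
    }
    return close(fd) == 0 ? 0 : 1;
}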

joe2748 03-05-2010 04:39 PM

I tried the suggestion of using fflush(f). The program is now hung up on the fflush(), not the fclose(). However, my first instance of this program has now been running for 3 hours (I don't feel like rebooting), so in my opinion this cannot just be a matter of waiting for the system to write the file.

Any other ideas?



Code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char** argv){
    off_t o;
    o = (off_t)pow(2, 33);
    printf("The size of an off_t is %zu B\n", sizeof(off_t));
    FILE* f = fopen64("fuzbucket", "wb");
    if (f == NULL){
        fprintf(stderr, "Error opening file\n");
        exit(1);
    }
    char data = 'd';
    if (fseeko64(f, o, SEEK_SET) != 0){
        fprintf(stderr, "Error seeking\n");
        exit(1);
    }
    if (fwrite(&data, sizeof(char), 1, f) != 1){
        fprintf(stderr, "Error writing\n");
        exit(1);
    }
    printf("Expected file size is %lli\n", (long long)(o + 1));
    fflush(f);
    fprintf(stderr, "passed fflush()\n");
    fclose(f);
    return 0;
}

joe2748 03-05-2010 04:43 PM

Ah ha!

I am using ecryptfs to encrypt my home directory. If I run the program in /var/tmp, which is unencrypted, then I get the expected results.

Therefore, this must be an issue for some forum other than programming.

Anyone know which forum I need, or have a solution for me? I'm a bit overprotective of my work, so I would like it encrypted if possible.

nadroj 03-05-2010 04:45 PM

I'd recommend trying to find the exact file size that makes it hang. Does it work with 4GB, 5GB, 6GB? Do a manual binary search to find the number as quickly as possible (see the sketch below). After you find it, search around for that file size and other related keywords (C file I/O, ext4 vs ext3, etc.). If there's some known issue or similar behaviour experienced by others, you should be able to find it that way.
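A sketch of such a probe (hypothetical; the file name "probe.bin" is made up): taking the offset on the command line makes each binary-search step a single run.

Code:

#include <stdio.h>
#include <stdlib.h>

/* Usage: ./probe <offset-in-bytes>
   If a run hangs, the offset is past the limit: retry with a
   smaller one. Build with -D_FILE_OFFSET_BITS=64. */
int main(int argc, char **argv){
    off_t o;
    FILE *f;
    char data = 'd';
    if (argc != 2){
        fprintf(stderr, "usage: %s <offset>\n", argv[0]);
        return 1;
    }
    o = (off_t)atoll(argv[1]);
    f = fopen("probe.bin", "wb");
    if (f == NULL || fseeko(f, o, SEEK_SET) != 0){
        perror("fopen/fseeko");
        return 1;
    }
    if (fwrite(&data, 1, 1, f) != 1){
        perror("fwrite");
        return 1;
    }
    fclose(f);  /* the suspect call */
    printf("ok at offset %lld\n", (long long)o);
    return 0;
}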


EDIT: I see you found the problem. ecryptfs has given me problems in the past (not programming problems, though). I think it's still a programming issue you're having, though. Maybe start another thread specifying the problem FS. However, considering this file is four times the size of the one that works (2^31), it might actually just be taking a while to encrypt, and it is in fact working. How long does the 4GB file that worked take? Maybe read up on some technical details/specs of the FS.

joe2748 03-05-2010 04:59 PM

The 4GB file takes a few seconds, literally less than 10 seconds. Also, I see the hard drive light on my laptop come on immediately with the 4GB file.

With the 8GB file, the program can run for hours with no discernible progress and no hard drive indicator light.

Also, what is the etiquette for starting another thread on the same topic but with a different title? Can I rename this thread? Or can I close this one and start another?

neonsignal 03-05-2010 05:01 PM

There is no guarantee that the ecryptfs filesystem has the same limits as the underlying one, so this could well be an ecryptfs issue.
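One way to compare those limits (a speculative sketch; whether ecryptfs reports a meaningful value here is not guaranteed) is POSIX pathconf() with _PC_FILESIZEBITS, which gives the number of bits used to represent file sizes on the filesystem holding a path:

Code:

#include <stdio.h>
#include <unistd.h>

/* Print the file-size limit, in bits, of the filesystem holding
   each path given on the command line. */
int main(int argc, char **argv){
    int i;
    for (i = 1; i < argc; i++){
        long bits = pathconf(argv[i], _PC_FILESIZEBITS);
        if (bits == -1)
            perror(argv[i]);
        else
            printf("%s: file sizes up to %ld bits\n", argv[i], bits);
    }
    return 0;
}

Running it on /var/tmp and on the encrypted home directory would show whether the two filesystems report different limits.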

paulsm4 03-05-2010 06:16 PM

Hi -

It certainly sounds like eCryptFS might be the culprit here.

Strong suggestion:
Go to the eCryptFS developer page and ask the community about LFS support for an eCryptFS filesystem using the standard library I/O calls (like "fopen64").

http://ecryptfs.sf.net
.. or ..
https://lists.sourceforge.net/lists/...ecryptfs-users
.. or ..
http://ecryptfs.sourceforge.net/

joe2748 03-05-2010 08:09 PM

Thanks for the suggestions!

I will head over there and see what I can find.

