LinuxQuestions.org
Old 02-08-2006, 09:46 PM   #1
hedpe
Member
 
Registered: Jan 2005
Location: Pittsburgh
Distribution: Ubuntu
Posts: 378

Rep: Reputation: 30
what could cause this fwrite() to seg fault?


Hey guys,

I am getting a segmentation fault which is backtraced to this block of code, generated by the fwrite() call.

Code:
    /* Append new data to our temporary file */
    sprintf(filename, "%s.tmp", cl->cam->name);
    fd = fopen(filename, "a");

    if(fd == NULL)
      printf("Error opening file %s: %s", filename, strerror(errno));

    fwrite(data, sizeof(char), cl->in_cnt, fd);
    fclose(fd);
There is no "Error opening file" printed out on the screen, so fd is NOT null.

Here is my backtrace:
Code:
Program received signal SIGSEGV, Segmentation fault.
0x42062d67 in fwrite () from /lib/tls/libc.so.6
#0  0x42062d67 in fwrite () from /lib/tls/libc.so.6
#1  0x0804c042 in cam_data (c=0x804d980, cl=0x80829a0) at caching.c:1355
#2  0x080497cf in parse_command (c=0x804d980, cl=0x80829a0) at caching.c:307
#3  0x080491a2 in check_data (c=0x804d980) at caching.c:144
#4  0x08048e7b in main () at caching.c:52
#5  0x42015704 in __libc_start_main () from /lib/tls/libc.so.6
If any other information would prove to be helpful let me know and I can get it by making it crash again (easily done!)

Thanks!
George
 
Old 02-09-2006, 02:01 AM   #2
paulsm4
Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
1. Make sure "data" is either a valid buffer, or a pointer that's correctly initialized to point to a valid buffer, at the time "fwrite()" is called. ("fwrite()" only reads from "data", so the memory has to be readable for the full length you pass.)

2. "sizeof(char)", of course, is always "1".

3. You should also check the value of "cl->in_cnt" when the crash occurs, and make sure that it's no larger than the number of bytes available in your buffer starting at "data".

4. What's a "valid buffer"? Any array that you've declared locally, declared statically, or allocated via "malloc()" or "new" (and, of course, have *not* inadvertently freed or deleted before calling "fwrite()").

Step through the program in the debugger and, when the crash occurs, use gdb's "print" (e.g. "p cl->in_cnt") and "examine memory" (e.g. "x/16 data") commands to investigate these points.

'Hope that helps .. PSM

PS:
I assume you declared "fd" as "FILE *fd" ("fp" might have been a better choice; "fd" is generally for numeric "file descriptors" instead of stdio "file pointers").

I also assume that "filename" is a character array long enough to hold the actual filename you generated with "sprintf()".

If either assumption is incorrect, that, too might cause fwrite() to crash...
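A minimal sketch of the two assumptions above, for anyone who wants them enforced in code rather than assumed. The helper name "open_tmp_for_append" and the 256-byte name buffer are hypothetical (the thread never shows the real declarations); the idea is only that "snprintf()" cannot overrun "filename", and that a failed "fopen()" is reported on stderr and handed back to the caller as NULL instead of being passed on to "fwrite()".

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original program): build "<name>.tmp"
 * and open it for appending. Returns NULL on any failure. */
static FILE *open_tmp_for_append(const char *name)
{
    char filename[256];   /* assumed size; the real one is not shown in the thread */
    int n;
    FILE *fp;

    assert(name != NULL);

    /* snprintf never writes past the buffer; a result >= the buffer size
     * means the name would have been truncated, so treat it as an error. */
    n = snprintf(filename, sizeof filename, "%s.tmp", name);
    if (n < 0 || (size_t)n >= sizeof filename) {
        fprintf(stderr, "File name too long: %s.tmp\n", name);
        return NULL;
    }

    fp = fopen(filename, "a");
    if (fp == NULL)
        fprintf(stderr, "Error opening file %s: %s\n", filename, strerror(errno));

    return fp;   /* caller must check for NULL before fwrite()/fclose() */
}
```

A caller would then do something like "fp = open_tmp_for_append(cl->cam->name); if (fp == NULL) return;" before touching the stream.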

Last edited by paulsm4; 02-09-2006 at 02:03 AM.
 
Old 02-09-2006, 02:52 AM   #3
hedpe
Original Poster
thanks for the response paulsm4

I am going to add that extra debugging and print out the memory

unfortunately it takes a little while because the code is running on emulab, and I have to restart the experiment

I wish it would crash locally on my computer, but it only crashes when running on emulab, so i'm not sure of the problem yet

you are correct in assuming fd is declared as FILE *fd;

filename is definitely of sufficient size too

i suspect something is wrong with "data"
 
Old 02-09-2006, 03:10 AM   #4
hedpe
Original Poster
i get this printed out about a hundred times before my seg fault:

Code:
data: 0x0x8082c5e       cl->in_buf: 0x0x8082c59 cl->in_cnt before: 512  after: 507
after this code mod:
Code:
    printf("data: 0x%p\tcl->in_buf: 0x%p\tcl->in_cnt before: %d", data, cl->in_buf, cl->in_cnt);
    cl->in_cnt = cl->in_cnt - (data - cl->in_buf); /* the number of bytes to write to the file will be less
                                                      after stripping out the "numbytes|" part of the data
                                                      so we recalculate subtracting pointers */
    printf("\tafter: %d\n", cl->in_cnt);
    fflush(stdout);
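For comparison, here is a hedged sketch of the same "strip the numbytes| part" step written so that it never reads past the received bytes and never modifies the original count. The helper name "skip_header" is made up for illustration, and it assumes the header always ends at the first '|'. Because the payload length is returned separately, calling it twice on the same buffer gives the same answer instead of shrinking the count again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a buffer of in_cnt bytes that starts with a
 * "numbytes|" header, return a pointer to the first payload byte and store
 * the payload length in *payload_len. Returns NULL if no '|' is found. */
static const char *skip_header(const char *in_buf, size_t in_cnt,
                               size_t *payload_len)
{
    const char *sep;

    assert(in_buf != NULL && payload_len != NULL);

    /* memchr stays within in_cnt bytes, unlike strchr on unterminated data */
    sep = memchr(in_buf, '|', in_cnt);
    if (sep == NULL)
        return NULL;

    /* sep + 1 is at most in_buf + in_cnt, so this cannot underflow */
    *payload_len = in_cnt - (size_t)(sep + 1 - in_buf);
    return sep + 1;
}
```

With a 512-byte buffer and a 5-byte header, this yields a payload length of 507, matching the numbers printed above.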
 
Old 02-11-2006, 12:59 PM   #5
hedpe
Original Poster
bump

i've run it under valgrind, which reported no memory leaks or out-of-bounds errors; i have no clue what the problem is
 
Old 02-11-2006, 01:48 PM   #6
paulsm4
Hi -

At first, this seemed like a really simple problem. But clearly, there's more here than superficially meets the eye.

Modest suggestions:
1. Please add the following instrumentation:

a) Outside of your function (if possible)
Code:
static unsigned long ct = 0;
b) Just before your "fwrite()":
Code:
  /* %p expects a void *, and an unsigned long needs %lu */
  fprintf (stderr, "data: %p, cl: %p, cl->in_cnt: %d, fp: %p, ct: %lu\n",
    (void *)data, (void *)cl, cl->in_cnt, (void *)fp, ct++);
  fwrite(data, sizeof(char), cl->in_cnt, fp);
c) Just after your "fclose()" (and after every other pointer you "fclose()" or "free()"):
Code:
  fclose (fp);
  fp = NULL;
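2. The benefit of "fprintf (stderr)" is that stderr is unbuffered by default, so each message appears immediately. stdout is buffered, so anything still sitting in its buffer when the program crashes is lost; "printf()" followed by "fflush (stdout)" helps, but only if the "fflush()" actually runs before the crash.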

3. The "fp" vs "fd" stuff is just an idiom for differentiating between something you "open()" vs something you "fopen()". Just housekeeping. Please humor me.

4. If possible, it'd be interesting to run the SAME test inside of and outside of emulab (I don't know anything about emulab, so I really don't have any advice here).

5. Finally, see if there's any way to instrument your "data" buffer. Valgrind was an excellent idea. Perhaps you can put "canaries" - "sentinel values" - at the start and end of your data buffer and check them each time you read from/write to your buffer?
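The "canaries" idea in point 5 could be sketched roughly as follows. Everything here is hypothetical: the struct name, the sentinel pattern, and the 512-byte size (chosen only because the in_cnt values printed earlier in the thread top out at 512). The point is just that a sentinel word sits on each side of the buffer and can be checked cheaply after every read or write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GUARD_VALUE 0xDEADBEEFu   /* arbitrary sentinel pattern */
#define BUF_SIZE    512           /* assumed buffer size, for illustration only */

/* Hypothetical wrapper: the data buffer sandwiched between two sentinels. */
struct guarded_buf {
    uint32_t head;
    char     data[BUF_SIZE];
    uint32_t tail;
};

static void guarded_init(struct guarded_buf *g)
{
    assert(g != NULL);
    g->head = GUARD_VALUE;
    g->tail = GUARD_VALUE;
    memset(g->data, 0, sizeof g->data);
}

/* Returns 1 if both sentinels are intact, 0 if something overwrote them. */
static int guarded_ok(const struct guarded_buf *g)
{
    assert(g != NULL);
    return g->head == GUARD_VALUE && g->tail == GUARD_VALUE;
}
```

Calling "guarded_ok()" right before each "fwrite()" and printing a message to stderr when it returns 0 would show whether something is scribbling just outside the buffer. It only catches overruns that land on the sentinels, so it complements valgrind rather than replacing it.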

Feel free to contact me directly via e-mail, or continue posting to this LQ thread.

Good luck!

Yours .. PSM

Last edited by paulsm4; 02-11-2006 at 04:34 PM.
 
Old 02-11-2006, 02:05 PM   #7
hedpe
Original Poster
thanks for your constant suggestions, they are very helpful

what are you suggesting by the "ct" variable? To count the number of fwrites that complete?
 
Old 02-11-2006, 02:22 PM   #8
paulsm4
Yes, exactly. You could also use "ct" to set a conditional breakpoint in gdb (if, for example, it succeeds 99 times, and you don't want to break until just before iteration #100).
 
Old 02-11-2006, 02:53 PM   #9
hedpe
Original Poster
fprintf is definitely helping me see more than i had seen before:
Code:
data: 0x80a5602, cl: 0x80a53e4, cl->in_cnt: 507, fp: 0x83d62f0
data: 0x80a5602, cl: 0x80a53e4, cl->in_cnt: 0, fp: 0x83d62f0
with printfs i never saw an instance where cl->in_cnt was 0. there is a case which I need to look over more carefully; will report back. thanks for the suggestion
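A small, hedged demonstration of why buffered output can go missing. The function below is hypothetical and uses a temporary file instead of the terminal so the effect can be measured: it writes a short message with no trailing newline, does not flush, and reports how many bytes have actually reached the file. It relies on the POSIX calls "fileno()" and "fstat()".

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fstat() */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical demo: write a short message to a temporary file using the
 * given buffering mode (_IOFBF, _IOLBF or _IONBF) WITHOUT flushing, and
 * return how many bytes have reached the file. Returns -1 on error. */
static long bytes_visible_without_flush(int mode)
{
    static char buf[BUFSIZ];
    struct stat st;
    long size = -1;
    FILE *fp;

    assert(mode == _IOFBF || mode == _IOLBF || mode == _IONBF);

    fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* setvbuf must be called before any other operation on the stream */
    if (setvbuf(fp, mode == _IONBF ? NULL : buf, mode, sizeof buf) == 0 &&
        fputs("Error opening file", fp) >= 0 &&
        fstat(fileno(fp), &st) == 0)
        size = (long)st.st_size;

    fclose(fp);
    return size;
}
```

In the fully buffered and line-buffered modes nothing has been written out at the point of measurement, while the unbuffered mode has written every byte. A message printed with "printf()" and no newline is in the same position as the buffered cases, so if the process dies before the buffer is flushed, the message never appears.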
 
Old 02-11-2006, 04:37 PM   #10
paulsm4
One other potential "gotcha" - if DOS/Windows is in the mix - is reading/writing binary data without doing an "fopen (myfile, "a+b")".

The "binary" attribute is a no-op on Linux ... but could cause you MUCH grief on DOS or Windows...
 
Old 02-12-2006, 06:09 PM   #11
hedpe
Original Poster
I found the cause! Well, kind of: I know exactly what it's doing now, but I don't know why. We're switching over to a different piece of code, which has new and improved error handling thanks to your suggestion

Code:
[gnychis@caching ~]$ cat /local/logs/caching_startcmd.err
Error opening file camera15.tmp: Permission denied
[gnychis@caching Caching]$ ls -l camera15.tmp
-rw-r--r--    1 gnychis  SensorNets     7163 Feb 12 15:57 camera15.tmp
8O

I am actually not sure why it gets permission denied at all yet...

here is the code surrounding the error:
Code:
    /* Append new data to our temporary file */
    sprintf(filename, "%s.tmp", cl->cam->name);
    fp = fopen(filename, "a");

    if(fp == NULL) {
      fprintf(stderr, "Error opening file %s: %s", filename, strerror(errno));
      exit(-1);
    }

    fwrite(data, 1, cl->in_cnt, fp);
    fclose(fp);
    fp = NULL;
That error happened after the code ran for about 15 minutes, and there were about 15 other .tmp files being created and modified without a problem, so it's not something that happens right away; it works at times. The same process creates the file, too. There are also no race conditions on the files, because it is not a multithreaded program.

if it were a disk space error the error would have returned out of space, correct?
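On the disk space question: a full disk is normally reported as ENOSPC ("No space left on device") rather than EACCES ("Permission denied"), and because stdio buffers writes, it tends to show up at "fwrite()", "fflush()" or "fclose()" rather than at "fopen()". The sketch below is a hypothetical helper, not code from the thread, that checks all three calls and returns the errno value of the first failure so the caller can tell the cases apart.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: write len bytes, flush, and close.
 * Returns 0 on success or the errno value of the first failure.
 * The stream is always closed. */
static int write_and_close(FILE *fp, const void *data, size_t len)
{
    int err = 0;

    assert(fp != NULL);

    errno = 0;
    if (fwrite(data, 1, len, fp) != len)
        err = errno ? errno : EIO;

    /* Buffered data may only reach the file system here, so a full disk
     * (ENOSPC) or an NFS problem can surface at fflush() or fclose(). */
    if (err == 0 && fflush(fp) != 0)
        err = errno ? errno : EIO;

    if (fclose(fp) != 0 && err == 0)
        err = errno ? errno : EIO;

    return err;
}
```

The caller can print "strerror()" of the returned value, which makes it obvious whether a failure was a permissions problem, a full disk, or something else.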
 
Old 02-12-2006, 11:02 PM   #12
hedpe
Original Poster
it turns out it was because my files were on NFS. i suppose some sort of NFS packet loss or data inconsistency problem was making it seg fault in fread(), fwrite() and fclose() ... oh well!

I'm doing the experiments in /tmp instead :-P
 
  



All times are GMT -5.
