LinuxQuestions.org
Old 06-16-2005, 11:26 AM   #1
crosseyedalien
LQ Newbie
 
Registered: Nov 2004
Location: East Windsor, NJ
Distribution: Mandriva
Posts: 18

Rep: Reputation: 0
segmentation fault on return statement


My problem is that I get a segmentation fault on a return statement. Normally when I get a segfault I immediately start checking all of my pointers to find the one that is attempting an illegal access, and this sometimes works even when the segfault is on a return statement. However, sometimes I can't find any pointer that is going wrong, yet in the process of adding if-else statements to check them the segfault "magically" goes away, even though the if-else statements found all my pointers to be valid.

Unfortunately, this isn't all that uncommon when I'm programming. I'm looking for as many general suggestions as possible, so that when I come across this problem again I have another plan of attack after verifying all of my pointers.

In addition, this has happened to me on more than one computer so I don't believe that it is a hardware problem - but who knows.

Thanks.
 
Old 06-16-2005, 11:35 AM   #2
perfect_circle
Senior Member
 
Registered: Oct 2004
Location: Athens, Greece
Distribution: Slackware, arch
Posts: 1,783

Rep: Reputation: 53
Can you post some code?
I'm pretty sure it's something you are doing wrong.
 
Old 06-16-2005, 12:18 PM   #3
crosseyedalien
LQ Newbie
 
Registered: Nov 2004
Location: East Windsor, NJ
Distribution: Mandriva
Posts: 18

Original Poster
Rep: Reputation: 0
Towards the bottom of writeTemplateDisplay() there are two delwin() calls. When I was getting the segfault, the if statements that check them for errors were not present. As soon as I added them the segfault went away, yet they have never been triggered to show that delwin() failed. The actual segmentation fault occurred in the function that called this one, writeConfigurationDisplay(), but I didn't make any changes to it and the segfault still went away. writeConfigurationDisplay() would print its final fprintf(), but then segfault before execution could resume in its own caller. Now, the really strange thing is that if I erase those error-catching if statements at the end of writeTemplateDisplay(), the program now runs without problems. So I am really at a loss to understand what caused the original segmentation fault.


int writeTemplateDisplay(FILE *log, WINDOW *wptr, int vMid, int hMid) {
DIR *templateDir;
struct dirent *dirEntry;
char filename[201], firstFile[201], buffer[150];
int len, statRes, line = 1, contentLine, delwinRes;
struct stat *statBuf;
mode_t modes;
WINDOW *fileListWin, *fileContentWin;
long int directoryPos[100];
FILE *currentFile;

// erase contents of window and redraw box
weraseAndRebox(log, wptr);

// create two subwindows, one to list template files, the other to display their content
fileListWin = subwin(wptr, LINES-4, 30, 2, 32);
scrollok(fileListWin, 1);
box(fileListWin, ACS_VLINE, ACS_HLINE);

fileContentWin = subwin(wptr, LINES-4, COLS-64, 2, 63);
scrollok(fileContentWin, 1);
box(fileContentWin, ACS_VLINE, ACS_HLINE);

touchwin(wptr);

// open the templates directory
templateDir = opendir("/etc/sinterface/templates");
if (!templateDir) {
fprintf(log, "ERROR - could not open template directory, '/etc/sinterface/templates'.\n");
fflush(log);
return(-1);
}

// save the directory location for fast indexing later.
directoryPos[line-1] = telldir(templateDir);

// read templates directory
fprintf(log, "Reading templates directory...\n");
fflush(log);
dirEntry = readdir(templateDir);
while (dirEntry) {
len = strlen(dirEntry->d_name);
if (len >= 201) {
fprintf(log, "ERROR - Encountered a filename in excess of 200 characters in the template directory.\n");
fflush(log);
return(-2);
}

sprintf(filename, "/etc/sinterface/templates/%s\0", dirEntry->d_name);
fprintf(log, "Found file/directory: '%s'\n", filename);
fflush(log);

statRes = stat(filename, statBuf);
if (statRes != 0) {
fprintf(log, "Could not stat '%s'.\n", filename);
fflush(log);
directoryPos[line-1] = telldir(templateDir);
dirEntry = readdir(templateDir);
continue;
}

modes = statBuf->st_mode;
//if (!S_ISDIR(modes) && S_ISREG(modes)) {
if (S_ISREG(modes)) {
// display the name and save into a data structure
fprintf(log, "\t'%s' is a file\n", filename);
fflush(log);
mvwprintw(fileListWin, line, 1, "%s", dirEntry->d_name);

// copy the first file to the content subwindow
if (line == 1) {
contentLine = 1;
currentFile = fopen(filename, "r");
if (currentFile) {
wattrset(fileContentWin, COLOR_PAIR(7));
while(!feof(currentFile)) {
fgets(buffer, 150, currentFile);
mvwprintw(fileContentWin, contentLine, 1, "%s", buffer);
contentLine++;
}
wattrset(fileContentWin, COLOR_PAIR(3));
} else {
mvwprintw(fileContentWin, 10, 10, "File not readable!");
}
fclose(currentFile);
}

line++;
}

directoryPos[line-1] = telldir(templateDir);
dirEntry = readdir(templateDir);
}

wrefresh(wptr);

sleep(5);

// clean up
closedir(templateDir);
werase(fileListWin);
werase(fileContentWin);

delwinRes = delwin(fileListWin);
// This was added to try to determine where the segfault was occurring
if (delwinRes == ERR) {
// These error catching statement have never been tripped
fprintf(log, "Could not delete subwindow fileListWin.\n");
fflush(log);
}

delwinRes = delwin(fileContentWin);
// This was added to try to determine where the segfault was occurring
if (delwinRes == ERR) {
// These error catching statement have never been tripped
fprintf(log, "Could not delete subwindow fileContentWin.\n");
fflush(log);
}

touchwin(wptr);
weraseAndRebox(log, wptr);
wrefresh(wptr);

return(0);
}

int writeConfigurationDisplay(FILE *log, int dispNo, WINDOW *wptr) {
int horzMid = (int)floor((double)(COLS-31)/2.);
int vertMid = (int)floor((double)(LINES-2)/2.);

switch(dispNo) {
case 1: //Templates
writeTemplateDisplay(log, wptr, vertMid, horzMid);
// Segfault occurs after the above function returns but before execution resumes in
// function that called this one.
break;
case 2:
wattrset(wptr, COLOR_PAIR(4));
mvwprintw(wptr, vertMid, horzMid-14, " ");
mvwprintw(wptr, vertMid, horzMid-14, "not yet supported");
wrefresh(wptr);
break;
case 3:
wattrset(wptr, COLOR_PAIR(4));
mvwprintw(wptr, vertMid, horzMid-14, " ");
mvwprintw(wptr, vertMid, horzMid-14, "not yet supported");
wrefresh(wptr);
break;
case 4:
wattrset(wptr, COLOR_PAIR(4));
mvwprintw(wptr, vertMid, horzMid-14, " ");
mvwprintw(wptr, vertMid, horzMid-11, "not yet supported");
wrefresh(wptr);
break;
case 5:
wattrset(wptr, COLOR_PAIR(4));
mvwprintw(wptr, vertMid, horzMid-14, " ");
mvwprintw(wptr, vertMid, horzMid-13, "not yet supported");
wrefresh(wptr);
break;
case 6: // Help
wattrset(wptr, COLOR_PAIR(4));
mvwprintw(wptr, vertMid, horzMid-14, " ");
mvwprintw(wptr, vertMid, horzMid-11, "not yet supported");
wrefresh(wptr);
break;
}

fprintf(log, "about to leave write configuration display\n");
fflush(log);

return(0);
}
 
Old 06-16-2005, 12:21 PM   #4
jim mcnamara
Member
 
Registered: May 2002
Posts: 964

Rep: Reputation: 36
It means that you trashed the stack, i.e., you overwrote the return address.

Simplified version (32-bit x86): stacks grow down in memory.

When a function is called, the caller pushes the arguments onto the stack, and the call instruction then pushes the return address. Finally, the function makes room on the stack for the local variables you declare.

So
Code:
void foo(int a)
{
     int i;
}
might look like this on the stack (highest address first)

Code:
int a             4 bytes
return address    4 bytes
int i             4 bytes  <- current stack pointer (SP)
So if you were to write 12 bytes starting at the address of i, the return address would be corrupted. When you hit the return statement the program goes nuts because it cannot return to the bogus address.
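
One place in the posted code where a write could run past a local is the sprintf() into filename[201]: the "/etc/sinterface/templates/" prefix is already 26 characters, so any entry name longer than 174 characters would overflow the buffer even though it passes the len >= 201 check. Whether that is what actually happened here is only a guess. A minimal sketch of a bounded version, assuming a hypothetical helper build_path() that is not part of the original program:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the posted code): writes "<dir>/<name>" into
 * out, which has room for cap bytes including the terminating NUL.
 * Returns 0 on success, -1 if the full path would not fit. */
int build_path(char *out, size_t cap, const char *dir, const char *name)
{
    int n;

    assert(out != NULL && cap > 0);

    /* snprintf never writes more than cap bytes; its return value is the
     * length the full string would have had. */
    n = snprintf(out, cap, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= cap)
        return -1;   /* output error or truncated: treat as failure */
    return 0;
}
```

In writeTemplateDisplay() this would be called as build_path(filename, sizeof filename, "/etc/sinterface/templates", dirEntry->d_name), skipping the entry when it returns -1.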
 
Old 06-16-2005, 01:10 PM   #5
crosseyedalien
LQ Newbie
 
Registered: Nov 2004
Location: East Windsor, NJ
Distribution: Mandriva
Posts: 18

Original Poster
Rep: Reputation: 0
Sorry about the messy code posting.

I was sort of hoping that someone would go in the direction of stack corruption.

However, I am still uncertain about something else. Suppose I have two functions, fct1() and fct2(), and fct1() calls fct2(). Can fct2() directly access fct1()'s space on the stack, or would fct2() have to write a chunk of data to its own space large enough to reach and corrupt fct1()'s return address? Put another way, is it possible for fct2() to corrupt fct1()'s return address without corrupting its own? (NOTE: in the scenario I'm concerned with, fct1() doesn't pass the address of any of its own variables to fct2().)
 
Old 06-16-2005, 03:05 PM   #6
jim mcnamara
Member
 
Registered: May 2002
Posts: 964

Rep: Reputation: 36
The answer is: in C it is possible to corrupt anything you can write to, i.e., any memory you have write access to.

For example you could pass a pointer from fct1() to fct2() and have fct2 corrupt the stack up in fct1().

Your code is a little hard to read - I'd get into gdb and break on fct1(). Then set a break at the last line of fct1() before the return. Print the stack. Continue. Print the stack again.

I'm sure you'll see corruption.
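
To make the pointer case concrete without actually smashing a real stack, here is a rough sketch that uses a struct as a stand-in for fct1()'s frame. The struct, the function names and the fake "return address" value are all made up for illustration; a real frame's layout is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for fct1()'s stack frame. A struct keeps the layout well
 * defined so the sketch stays within defined behaviour. */
struct fake_frame {
    char buf[8];            /* a local buffer in "fct1" */
    unsigned int ret_addr;  /* pretend saved return address */
};

/* Plays the role of fct2(): writes n bytes wherever it is told to and
 * trusts the caller about the size. */
void fct2_fill(unsigned char *dst, size_t n)
{
    memset(dst, 'A', n);
}

/* Plays the role of fct1(): returns 1 if its pretend return address is
 * still intact after fct2_fill() has written n bytes, 0 otherwise. */
int fct1_survives(size_t n)
{
    struct fake_frame f;

    assert(n <= sizeof f);      /* never write outside the struct itself */
    memset(&f, 0, sizeof f);
    f.ret_addr = 0x08048000u;   /* arbitrary illustrative value */

    fct2_fill((unsigned char *)&f, n);
    return f.ret_addr == 0x08048000u;
}
```

fct1_survives(8) leaves the pretend return address alone, while fct1_survives(sizeof(struct fake_frame)) clobbers it. Nothing belonging to the "fct2" side is touched in either case.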
 
Old 06-16-2005, 10:17 PM   #7
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
You're definitely corrupting the stack. And (just to make things interesting) the actual failure (segfault on "return") is several steps removed from the actual error (the code that actually overwrote the stack).

A couple of strong suggestions:

1. Consider using a tool like "Purify" (commercial) or "Electric Fence" (open source).

2. For quick'n'dirty debugging, consider putting "canaries" around your local variables.
EXAMPLE:
a) Let's say you suspect "writeTemplateDisplay()" might be the culprit

b) Let's say writeTemplateDisplay() declares these local variables:

int writeTemplateDisplay(int arg1, const char *argb)
{
int i, j;
char buf[256];

c) Then try adding these "sentries" ("canaries") and check them periodically:
#define TEST_VAL 0x1234

int writeTemplateDisplay(int arg1, const char *argb)
{
unsigned int c1 = TEST_VAL;
int i, j;
char buf[256];
unsigned int c2 = TEST_VAL;
....
// Do Stuff
if ((c1 != TEST_VAL) || (c2 != TEST_VAL))
abort();
....
// Do more Stuff
if ((c1 != TEST_VAL) || (c2 != TEST_VAL))
abort();
....

If you execute the program in the debugger, the "abort" should get you very close to the root cause.

'Hope that helps .. PSM
 
Old 06-17-2005, 11:23 AM   #8
jim mcnamara
Member
 
Registered: May 2002
Posts: 964

Rep: Reputation: 36
Stack canaries work fine, but you may want to declare the variables volatile:
Code:
volatile int canary1= TEST_VAL;
Otherwise the compiler may optimize away an actual recheck of the value because it determines that you are not assigning anything to those variables.

In other words the compiler may perform some optimizing trick that undermines the validity of your check.
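
Putting the two suggestions together, a complete version might look roughly like the sketch below. The function guarded_copy() and the CHECK_CANARIES macro are invented for illustration; only TEST_VAL and the c1/c2 idea come from this thread. Bear in mind that the compiler is free to place locals in any order, so the canaries are not guaranteed to sit on either side of buf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEST_VAL 0x1234u

/* Print where the damage was noticed and abort if either canary changed. */
#define CHECK_CANARIES(c1, c2, where)                              \
    do {                                                           \
        if ((c1) != TEST_VAL || (c2) != TEST_VAL) {                \
            fprintf(stderr, "canary clobbered %s\n", (where));     \
            abort();                                               \
        }                                                          \
    } while (0)

/* Hypothetical function under test: copies src into a local buffer and
 * returns the length of what was stored. */
int guarded_copy(const char *src)
{
    volatile unsigned int c1 = TEST_VAL;  /* volatile forces a real re-read */
    char buf[256];
    volatile unsigned int c2 = TEST_VAL;

    assert(src != NULL);
    CHECK_CANARIES(c1, c2, "on entry");

    /* Bounded copy that truncates if src is too long. An unbounded
     * strcpy() here is the kind of bug the canaries are meant to catch. */
    snprintf(buf, sizeof buf, "%s", src);
    CHECK_CANARIES(c1, c2, "after copy");

    return (int)strlen(buf);
}
```

Run under gdb, an abort() from one of the checks stops the program near the point where the damage was first noticed, which is usually much closer to the cause than the eventual segfault on return.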
 
Old 06-17-2005, 11:29 AM   #9
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Good point - thanx for the reminder (I've been working with dopey M/S compilers in full debug mode - where "optimization" is seldom a problem - for too long ;-))

It's also worth emphasizing that the test values need to be allocated from the *stack*: "malloc's", "new's", "static's" and string literals are *not* invited...
 
Old 06-17-2005, 12:23 PM   #10
crosseyedalien
LQ Newbie
 
Registered: Nov 2004
Location: East Windsor, NJ
Distribution: Mandriva
Posts: 18

Original Poster
Rep: Reputation: 0
Thanks for all the feedback. It has been very useful.
 
  


All times are GMT -5.
