LinuxQuestions.org
Old 03-20-2007, 08:35 PM   #1
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Rep: Reputation: 15
Recognizing (seemingly) random crashes


My application crashes (seemingly) at random now and then. Sometimes I can get it to run for 3 days straight without a crash, but occasionally it crashes after being left alone for only a few hours. I don't know why it crashes, and I can't reproduce the crash on demand. How do I trap and debug the application to track down these random crashes? Any ideas? Thanks in advance.
 
Old 03-21-2007, 07:12 AM   #2
nacio
LQ Newbie
 
Registered: Mar 2007
Location: Italy
Distribution: Debian
Posts: 18

Rep: Reputation: 0
Are you working in C or C++ under Linux? Is the crash a segmentation fault? I assume so, because that's the typical situation I would expect in this forum; tell me if this is not the case.

Make sure to compile your app with the -g flag to gcc, and run the command ulimit -c 1024 in the same shell before launching your application (a core larger than that limit gets truncated, so ulimit -c unlimited is the safer choice). The next time your program crashes it will leave a file named 'core' (or 'core.<pid>' on some setups) containing the memory image of the process at the moment of the crash.

You can then run the command gdb <your executable file> core, and it will show you where the program crashed. Inside gdb, type the command bt to see what was on the stack at the moment of the crash. It prints the names of all the functions that were active and the values of their parameters.
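
The ulimit step can also be done from inside the program, which may help when the application is started from a launcher rather than a shell. A minimal sketch, assuming a POSIX system; the function name enableCoreDumps is made up for illustration and would be called once at the start of main():

```cpp
#include <cassert>
#include <sys/resource.h>

// Raise the soft core-file size limit to the hard limit for this process.
// Returns true on success. If the hard limit is 0, no core will be written.
bool enableCoreDumps()
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return false;

    // The soft limit can never exceed the hard limit.
    assert(limit.rlim_cur <= limit.rlim_max);

    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Raising the soft limit up to the hard limit needs no special privileges; whether a core file actually appears still depends on the hard limit and on where the kernel is configured to write it.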

Hope it helps
 
Old 03-21-2007, 11:02 PM   #3
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
Yes, I'm using C++ with the SDL library. I don't know whether it's a segmentation fault or not, because it freezes inside SDL (or maybe the X server). My application has a loop with drawing and event-handling routines, and it recognizes the ESC key to exit the application. But when it crashes, I can't even exit with the ESC key. Unfortunately, the application runs in full screen, so the terminal console doesn't even show. Maybe I'll try using the core file. Thanks a lot.
 
Old 03-22-2007, 07:07 AM   #4
nacio
LQ Newbie
 
Registered: Mar 2007
Location: Italy
Distribution: Debian
Posts: 18

Rep: Reputation: 0
Yeah, I've also had problems when running SDL applications in full-screen mode. It appears that keyboard input focus is set to no window so keyboard actions do nothing.

It's difficult to know whether your application actually exits or not. If your problem is a segmentation fault then it should terminate; otherwise you should suspect an infinite loop or something similar. I know, it's not easy because you see nothing on screen.

Next time it happens, try switching to a virtual console by pressing Ctrl-Alt-F1. You should get a working text-mode terminal where you can use ps to see if your program is still running. You can also kill your program or the X server (it should restart automatically in most distributions); maybe that gives you control back.

Another good piece of advice is to read up on gdb or its front-end 'ddd'. You will need to set breakpoints and step through the code if you do indeed have an infinite loop.
 
Old 04-15-2007, 11:21 PM   #5
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
Sorry for the late response; I've been busy with other issues. True, it should have terminated if it was a segmentation fault. But then again, if it was an infinite loop, the event handler should still be working. The crash always occurs while the application is waiting for an SDL thread to finish. Maybe I'll just have to try creating the core dump file.

Thanks.
 
Old 05-06-2007, 10:46 PM   #6
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
Sorry to bump this pretty old thread of mine. The reason is that the random crash is still happening. Even worse, now it occasionally freezes.

The crash usually happens inside an SDL_Thread. The thread handles printing with LaTeX and CUPS while the main application shows an animation and a progress bar. This is what's inside the thread's function:
Code:
int CMainApp::printNow(void *data)
{
    /// @todo implement me
  CMainApp *pMyApp=static_cast<CMainApp*>(data);
  pMyApp->createTempLatex(); //Create temporary latex file
  pMyApp->createTempPostScript(); //Create temporary postscript file
  pMyApp->printPostScriptFile(); //Print with CUPS
  return 1;
}
createTempLatex() is just a function that creates a LaTeX file as output, like this:

Code:
void CMainApp::createTempLatex()
{
  fstream filestr;
  filestr.open ("./temp-print.tex", fstream::out);
  fstream<<"\\begin{document}"<<endl;
  .....  //The body of the LaTeX file
  fstream<<"\\end{document}"<<endl;
  filestr.close();
}
And inside the createTempPostScript() function there are two system() calls, one for converting the LaTeX file to a DVI file and one for converting the DVI file to a PS file, like this:

Code:
void CMainApp::createTempPostScript()
{
    system("latex temp-print.tex -halt-on-error");
    system("dvips temp-print -Pcmz -t landscape -o temp-print.ps");
}
And printPostScriptFile() sends the PS file to the CUPS spooling queue for printing.

I don't know what's wrong with it, but this thread function randomly generates crashes. Often it works fine for a few print jobs, but every now and then it crashes. And the crash is always in this printing routine.

The crash itself is not always a segmentation fault. This is what usually happens when the application crashes:
1. Back to the terminal with a segmentation fault.
2. Hang. The animation and progress bar stop working. The event handler also stops working. It's as if the system freezes.
3. Infinite loop. The event handler, animation, and progress bar are working. But because the application needs the thread to finish before going to the next state, it stays in the current state (the printing state). This always happens if the thread is still busy in one of the system() calls when SDL_WaitThread() has been called.

Can anybody give me some help here? How do you usually solve this kind of problem? Thanks in advance.

Last edited by g4j31a5; 05-06-2007 at 10:47 PM.
 
Old 05-07-2007, 09:29 PM   #7
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
I notice that you are not checking whether the system() calls succeed. If, for example, the latex call fails, then the temp file may not be generated, which may cause a problem down the line...
When you get a segfault, which line does it occur in?
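
One way to check those calls is to decode the status that system() returns. A minimal sketch, assuming a POSIX system; runCommand is a hypothetical helper, not something from the original code:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <sys/wait.h>

// Run a shell command and return its exit code (0..255).
// Returns -1 if the command could not be started or was killed by a signal.
int runCommand(const std::string& command)
{
    assert(!command.empty());

    int status = std::system(command.c_str());
    if (status == -1)
        return -1;                      // system() itself failed

    if (WIFEXITED(status))
        return WEXITSTATUS(status);     // normal exit: report the exit code

    return -1;                          // terminated by a signal
}
```

createTempPostScript() could then stop and report the error as soon as runCommand("latex temp-print.tex -halt-on-error") returns something other than 0, instead of handing a missing DVI file to dvips.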
 
Old 05-08-2007, 10:56 AM   #8
nacio
LQ Newbie
 
Registered: Mar 2007
Location: Italy
Distribution: Debian
Posts: 18

Rep: Reputation: 0
I agree that checking for errors is necessary. Not only the system() calls may fail, but also the creation of the temp file.
In their book The Unix Programming Environment, Kernighan and Pike say that every single error returned by every single call that may fail must be checked. This greatly reduces debugging time, and also helps the user solve problems (e.g. disk full, file permissions, etc.).
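
The same idea applied to the temp file could look like the sketch below. writeTextFile is a hypothetical helper; what to do on failure (log it, skip the print job) is left to the caller:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write 'contents' to 'path', replacing any existing file.
// Returns false if the file could not be opened or the write failed.
bool writeTextFile(const std::string& path, const std::string& contents)
{
    assert(!path.empty());

    std::ofstream out(path.c_str());
    if (!out)
        return false;       // open failed (permissions, missing directory, ...)

    out << contents;
    out.close();
    return !out.fail();     // false if writing or closing failed (e.g. disk full)
}
```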

BTW, when you wrote:
Code:
fstream<<"\\begin{document}"<<endl;
didn't you mean filestr instead of fstream?

Another thing that I would do is adding some debug output:
Code:
#define DEBUG

int CMainApp::printNow(void *data)
{
    /// @todo implement me
  CMainApp *pMyApp=static_cast<CMainApp*>(data);
  pMyApp->createTempLatex(); //Create temporary latex file
#ifdef DEBUG
  cerr <<"Latex file created\n";
#endif
  pMyApp->createTempPostScript(); //Create temporary postscript file
#ifdef DEBUG
  cerr <<"PostScript file created\n";
#endif
  pMyApp->printPostScriptFile(); //Print with CUPS
#ifdef DEBUG
  cerr <<"File printed\n";
#endif
  return 1;
}
When you finish debugging your application, just remove the #define line; that has the same effect as removing all the debug output. This way you can at least tell where your application got caught in the infinite loop.
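
Since the application runs full screen and the terminal is not visible, the same markers could go to a file instead of cerr. A sketch, assuming a hypothetical TraceLog helper; std::endl flushes after every step, so the last line written before a crash should still reach the file (a hard power-off may still lose the most recent lines):

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Appends one line per step to a log file and flushes after each line.
class TraceLog
{
public:
    explicit TraceLog(const std::string& path)
        : out_(path.c_str(), std::ios::app)
    {
        assert(!path.empty());
    }

    // True if the log file was opened and no write has failed so far.
    bool ok() const { return out_.good(); }

    // Write one marker line; std::endl forces a flush to the file.
    void step(const std::string& message)
    {
        out_ << message << std::endl;
    }

private:
    std::ofstream out_;
};
```

In printNow() each cerr line would become a call such as log.step("Latex file created"), and after a crash the last line in the file shows how far the thread got.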

Now I'll try a guess: I've experienced segmentation faults when trying to write to a file that hadn't been opened successfully.

Good luck.
 
Old 05-09-2007, 10:54 PM   #9
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
@nacio
Oops, right, that one is a typo. It should be filestr. Silly me.

@graemef
Well, I forget which one, but I think it's definitely one of the lines in the thread function.

Yeah, both of you are right. Actually there is a lot more code that isn't error-checked, mostly because I was sure it wouldn't generate an error (and because I've gotten a little lazy). The parts I did error-check were the ones that changed a lot and were more error-prone (e.g. loading the image assets for SDL, opening and manipulating data files, etc.).

Okay, maybe that would solve the segfault issue. But what about the freezes / hangs? One of the random occurrences was a random freeze. I don't think it's a memory leak, because I tested the application for 3 days straight and it worked fine. But when my boss tried it, the application froze after being on for only a few hours. The animation froze, the event handler didn't work, the application was just plain dead. What I can't figure out is what triggers the randomness. I did the same things my boss did. For now I'm just trying to recreate the crash.

Thanks a lot guys.
 
Old 05-10-2007, 05:01 PM   #10
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
Just a thought, but what processes are running when it freezes? Can you determine whether it is stuck in a process started by system()?
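
If a process started by system() turns out to be what hangs, one option is to run the command with a time limit so the printing thread always returns. A sketch, assuming a POSIX system; runWithTimeout and its return codes are made up for illustration:

```cpp
#include <cassert>
#include <ctime>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run 'command' through /bin/sh with a time limit.
// Returns the exit code (0..255), -1 on failure or death by signal,
// and -2 if the command was killed because it exceeded the limit.
int runWithTimeout(const char* command, int timeoutSeconds)
{
    assert(command != 0 && timeoutSeconds > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        // Child: own process group, so the shell and its children die together.
        setpgid(0, 0);
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(0));
        _exit(127);                         // exec failed
    }

    setpgid(pid, pid);                      // same call in the parent avoids a race
    const time_t deadline = time(0) + timeoutSeconds;

    for (;;) {
        int status = 0;
        pid_t result = waitpid(pid, &status, WNOHANG);
        if (result == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (result < 0)
            return -1;

        if (time(0) >= deadline) {
            kill(-pid, SIGKILL);            // negative pid: the whole process group
            waitpid(pid, &status, 0);       // reap the child, no zombie left behind
            return -2;
        }
        usleep(100000);                     // poll every 0.1 s
    }
}
```

A return value of -2 from a call such as runWithTimeout("latex temp-print.tex -halt-on-error", 60) would then point at latex as the process that got stuck; the 60-second limit is only an example value.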
 
Old 05-13-2007, 09:16 PM   #11
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
Well, just the other day it crashed again. Right after I ran it under valgrind, I tried executing it again, but it crashed before it had even entered the main loop (or it had entered it but froze). Just a blank screen. And because the keyboard wasn't working either, I just rebooted with the on/off switch. After that I tried again, fresh from the new boot. The application worked for a while, but this morning I found it frozen again. This time it wasn't in the system() call / the other thread, but rather in the main GUI animation thread (the SDL GUI that runs before printing with the LaTeX system() calls). And as usual, I just rebooted with the on/off switch. So I think the segfaults are always in the printing thread (especially in the LaTeX system() calls), but the freezes can happen pretty much anywhere in the application.

BTW, can a freeze like that generate a core dump file?
 
Old 05-13-2007, 11:19 PM   #12
paulsm4
Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Actually, a "hang" might be even better for you than a core file.

GDB allows you to attach to a live process (gdb <your executable file> <pid>, or gdb -p <pid>).

Once you've attached, you can use "where" to get a traceback and determine exactly where the hang is occurring.

Here are two links that explain further:

http://www-128.ibm.com/developerwork...ix-strace.html

http://www.network-theory.co.uk/docs...cintro_76.html
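
If nobody is at the machine to attach gdb when the hang happens, a watchdog can turn the hang into a core dump for later inspection. A sketch, assuming a POSIX system, that core dumps are enabled, and that nothing else in the process (SDL timers included) relies on SIGALRM; armWatchdog and feedWatchdog are hypothetical names:

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <unistd.h>

// Runs when the alarm expires: abort() raises SIGABRT, whose default
// action terminates the process and writes a core file.
extern "C" void onWatchdogTimeout(int)
{
    std::abort();
}

// Install the handler and start the countdown. Call once before the main loop.
void armWatchdog(unsigned int seconds)
{
    assert(seconds > 0);
    std::signal(SIGALRM, onWatchdogTimeout);
    alarm(seconds);
}

// Restart the countdown. Call on every iteration of the main loop.
void feedWatchdog(unsigned int seconds)
{
    assert(seconds > 0);
    alarm(seconds);     // replaces any alarm that is still pending
}
```

The main loop would call armWatchdog(30) once and feedWatchdog(30) on every iteration; if the loop stalls for more than 30 seconds the process aborts and the resulting core file can be opened in gdb. This only catches hangs inside the application, not a freeze of the whole machine.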

'Hope that helps .. PSM
 
Old 05-16-2007, 03:45 AM   #13
g4j31a5
Member
 
Registered: Sep 2006
Distribution: open SuSE 10.0
Posts: 116

Original Poster
Rep: Reputation: 15
Thanks, will look at those links. I'm still a newb concerning bug hunting like this. Especially in Linux. Thanks a lot.
 
  

