 LinuxQuestions.org Recognizing a (seemingly) random crashes
03-20-2007, 08:35 PM #1 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116
My application crashes (seemingly) at random now and then. Sometimes I can get it to run for 3 days straight without a crash, but occasionally it crashes after I have left it running for only a few hours. I don't know why it crashes, and I can't reproduce the crash on demand. How do I trap and debug the application to track down these random crashes? Any ideas? Thanks in advance.
03-21-2007, 07:12 AM #2 nacio LQ Newbie Registered: Mar 2007 Location: Italy Distribution: Debian Posts: 18
Are you working in C or C++ under Linux? Is the crash a segmentation fault? I assume so, because that's the typical situation I would expect in this forum; tell me if this is not the case.

You should make sure to compile your app with the -g flag to gcc, and run the command ulimit -c 1024 in the same shell before launching your application (use a larger value, or ulimit -c unlimited, if the core file comes out truncated). The next time your program crashes, it will leave a file named 'core' containing debug info. You can then run gdb on your executable together with the core file (gdb yourapp core, where yourapp stands for your program's binary), and it will show you where it crashed. Additionally, you can type the command bt inside gdb to see what was on the stack at the moment of the crash. It prints the names of all the functions that were being called and the values of their parameters.

Hope it helps
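The ulimit -c step only affects programs started from that same shell. If the application is launched some other way (a desktop shortcut or a startup script, say), the limit can also be raised from inside the program. What follows is only a minimal sketch, assuming a Linux/POSIX system; enable_core_dumps is a made-up helper name and is not part of the original application.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/resource.h>

// Hypothetical helper: raise this process's soft core-file size limit to
// the hard limit. Returns true if core dumps are allowed afterwards.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        std::perror("getrlimit(RLIMIT_CORE)");
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // the soft limit may go up to the hard limit
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        std::perror("setrlimit(RLIMIT_CORE)");
        return false;
    }
    return lim.rlim_cur != 0;     // a hard limit of 0 still forbids core files
}
```

Calling enable_core_dumps() once near the top of main() should be enough. Where the core file ends up, and what it is called, depends on the system configuration (on Linux, see /proc/sys/kernel/core_pattern).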
03-21-2007, 11:02 PM #3 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
Yes, I'm using C++ with the SDL library. I don't know whether it's a segmentation fault or not, because it freezes in SDL (or maybe in the X server). My application has a loop that runs the drawing and event-handling routines, and it recognizes the ESC key to exit the application. But when it crashes, I can't even exit with the ESC key. Unfortunately, the application runs in full screen, so the terminal console doesn't even show. Maybe I'll try using the core file. Thanks a lot.
03-22-2007, 07:07 AM #4 nacio LQ Newbie Registered: Mar 2007 Location: Italy Distribution: Debian Posts: 18
Yeah, I've also had problems when running SDL applications in full-screen mode. It appears that the keyboard input focus ends up on no window at all, so key presses do nothing. It's difficult to know whether your application has actually exited or not. If your problem is a segmentation fault, then the program should terminate; otherwise you should suspect an infinite loop or something similar. I know it's not easy, because you see nothing on screen.

Next time it happens, try switching to a virtual console by pressing Ctrl-Alt-F1. You should get a working text-mode terminal where you can use ps to see if your program is still running. You can also kill your program or the X server (it should restart automatically in most distributions); maybe that gives you control back.

Another good piece of advice is to read something about gdb or its front-end 'ddd'. You will need to set breakpoints and step through the code if you do indeed have an infinite loop.
04-15-2007, 11:21 PM #5 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
Sorry for the late response. I've been busy with another issue. True, it should have terminated if it was a segmentation fault. But then again, if it was an infinite loop, the event handler should still be working. The crash always occurs while the application is waiting for an SDL thread to finish. Maybe I'll just have to try creating the core dump file.

Thanks.

05-06-2007, 10:46 PM #6 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
Sorry to bump this rather old thread of mine, but the random crash is still happening. Even worse, the application now occasionally freezes as well. The crash usually happens inside an SDL_Thread. The thread prints with LaTeX and CUPS while the main application shows an animation and a progress bar. This is what's inside the thread's function:

Code:
int CMainApp::printNow(void *data)
{
    /// @todo implement me
    CMainApp *pMyApp = static_cast<CMainApp *>(data);
    pMyApp->createTempLatex();       //Create temporary latex file
    pMyApp->createTempPostScript();  //Create temporary postscript file
    pMyApp->printPostScriptFile();   //Print with CUPS
    return 1;
}

createTempLatex() is simply a function that writes a LaTeX file as output, like this:

Code:
void CMainApp::createTempLatex()
{
    fstream filestr;
    filestr.open ("./temp-print.tex", fstream::out);
    fstream<<"\\begin{document}"<
05-07-2007, 09:29 PM #7 graemef Senior Member Registered: Nov 2005 Location: Hanoi Distribution: Fedora 13, Ubuntu 10.04 Posts: 2,379
I notice that you are not checking whether the system() calls succeed. If, for example, the latex call fails for some reason, the temp file may not be generated, and that may cause a problem further down the line... When you get a segfault, which line does it occur in?
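To make the point about unchecked system() calls concrete, here is a minimal sketch of a wrapper that reports failure instead of ignoring it. It assumes a POSIX system (the macros come from <sys/wait.h>). The name run_checked is invented for illustration, and the actual commands run by the original code are not shown in the thread.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: run a shell command through system() and return
// true only if it could be started and exited with status 0.
bool run_checked(const std::string& command) {
    int status = std::system(command.c_str());
    if (status == -1) {
        std::cerr << "could not start: " << command << '\n';
        return false;
    }
    if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 0) {
            return true;
        }
        std::cerr << "'" << command << "' exited with code " << code << '\n';
        return false;
    }
    if (WIFSIGNALED(status)) {
        std::cerr << "'" << command << "' was killed by signal "
                  << WTERMSIG(status) << '\n';
    }
    return false;
}
```

The print thread could then stop at the first step that fails, for example by returning early when run_checked(...) is false, rather than carrying on with a file that was never produced. If the command is latex, it may also be worth passing -interaction=nonstopmode so that an error in the .tex file makes it exit with a failure status rather than wait for input; that is only a guess at one way such a thread could hang, not something established in this thread.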
05-08-2007, 10:56 AM #8 nacio LQ Newbie Registered: Mar 2007 Location: Italy Distribution: Debian Posts: 18
I agree that checking errors is necessary. Not only may the system() calls fail, but so may the creation of the temp file. In his book The Unix Programming Environment, Kernighan says every single error generated by every single call that may fail must be checked. This greatly reduces debugging time, and it also helps the user solve problems (e.g. disk full, file permissions, etc.).

BTW, when you wrote:

Code:
fstream<<"\\begin{document}"< [...]

Code:
[...] (data);
    pMyApp->createTempLatex();       //Create temporary latex file
#ifdef DEBUG
    cerr <<"Latex file created\n";
#endif
    pMyApp->createTempPostScript();  //Create temporary postscript file
#ifdef DEBUG
    cerr <<"PostScript file created\n";
#endif
    pMyApp->printPostScriptFile();   //Print with CUPS
#ifdef DEBUG
    cerr <<"File printed\n";
#endif
    return 1;
}

When you finish debugging your application, just remove the #define line; that has the same effect as removing all the debug output. This way, at least, you can find out where your application got caught in the infinite loop.

Now I'll try to guess something: I've experienced segmentation faults when trying to write to a file that hadn't been successfully opened. Good luck.
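As an illustration of checking that the temp file really was opened and written, here is a small sketch. write_text_file is a hypothetical helper, not the original createTempLatex(), and it takes the file contents as a string only to keep the example short.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: write `contents` to `path`, reporting every failure.
bool write_text_file(const std::string& path, const std::string& contents) {
    std::ofstream out(path.c_str());
    if (!out.is_open()) {
        std::cerr << "cannot open " << path << " for writing\n";
        return false;
    }
    out << contents;
    out.close();       // flushes; the stream goes into a failed state on error
    if (out.fail()) {  // e.g. disk full or an I/O error during the write
        std::cerr << "error while writing " << path << '\n';
        return false;
    }
    return true;
}
```

A caller would check the result before moving on, for example by skipping the PostScript and printing steps when write_text_file("./temp-print.tex", ...) returns false.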
05-09-2007, 10:54 PM #9 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
@nacio
Oops, right, that one is a typo. It should be filestr. Silly me.

@graemef
Well, I forget which one, but I think it's definitely one of the lines in the thread function.

Yeah, both of you are right. Actually, there is a lot more code that didn't get error-checked, mostly because I was sure it wouldn't generate an error (and because I've gotten a little lazy). The parts I did error-check were the ones that changed a lot and were more error-prone (e.g. loading the image assets for SDL, opening and manipulating data files, etc.).

Okay, maybe that would solve the segfault issue. But what about the freezes / hangs? Some of the random occurrences were freezes. I don't think it's a memory leak, because I tested the application for 3 days straight, non-stop, and it worked fine. But when my boss tried it, the application froze after being on for only a few hours. The animation froze, the event handler didn't work, the application was just plain dead. What I can't figure out is what triggers the randomness. What I did was the same as what my boss did. Now I am just trying to recreate the crash.

Thanks a lot guys.
05-10-2007, 05:01 PM #10 graemef Senior Member Registered: Nov 2005 Location: Hanoi Distribution: Fedora 13, Ubuntu 10.04 Posts: 2,379
Just a thought, but what processes are running when it freezes? Can you determine whether it is stuck in a process started by system()?
05-13-2007, 09:16 PM #11 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
Well, just the other day it crashed again. Right after I had run it under valgrind, I tried executing it again. But it crashed before it had even entered the main loop (or it had entered it but froze). Just a blank screen. And because the keyboard wasn't working either, I just rebooted with the on/off switch. After that I tried again, fresh from the new boot. The application worked for a while, but this morning I found it frozen again. This time it wasn't in the system call / other thread, but in the main GUI animation thread (the SDL GUI that runs before printing with the LaTeX system calls). And as usual, I just rebooted with the on/off switch.

So I think the segfaults are always in the printing thread (especially in the LaTeX system calls), but the freezes can happen pretty much anywhere in the application. BTW, can a freeze like that generate a core dump file?
05-13-2007, 11:19 PM #12 paulsm4 LQ Guru Registered: Mar 2004 Distribution: SusE 8.2 Posts: 5,863 Blog Entries: 1
Actually, a "hang" might be even better for you than a core file. GDB allows you to attach to a live process. Once you've attached, you can use "where" to get a traceback and determine exactly where the hang is occurring.

Here are two links that explain further:
http://www-128.ibm.com/developerwork...ix-strace.html
http://www.network-theory.co.uk/docs...cintro_76.html

'Hope that helps .. PSM
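On the question of whether a freeze can produce a core file: not by itself, but a program can be set up to abort when its main loop stops making progress. The sketch below is one possible approach, not something proposed in the thread. It assumes a POSIX system and that nothing else in the process (including any library in use) relies on SIGALRM; install_stall_watchdog and heartbeat are made-up names.

```cpp
#include <cassert>
#include <cstdlib>
#include <signal.h>
#include <unistd.h>

// Default reaction to a stall: abort(), which raises SIGABRT and, if core
// dumps are enabled, leaves a core file showing where the program was stuck.
void abort_on_stall(int /*signum*/) {
    std::abort();  // abort() is async-signal-safe, so it may be called here
}

// Hypothetical helper: install the SIGALRM handler. Returns true on success.
bool install_stall_watchdog(void (*handler)(int) = abort_on_stall) {
    struct sigaction sa = {};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, nullptr) == 0;
}

// Hypothetical helper: call once per pass through the main loop. Each call
// replaces the pending alarm, so SIGALRM is delivered only if the loop goes
// quiet for timeout_seconds. Passing 0 cancels the pending alarm.
void heartbeat(unsigned int timeout_seconds) {
    alarm(timeout_seconds);
}
```

The idea is to call install_stall_watchdog() once at startup and heartbeat(30), or whatever timeout suits, on every iteration of the drawing and event loop. This only catches a hang of the process itself. If the whole machine locks up, nothing in the process will run, and attaching gdb from a virtual console or over the network is still the more direct way to look at a live hang.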
05-16-2007, 03:45 AM #13 g4j31a5 Member Registered: Sep 2006 Distribution: open SuSE 10.0 Posts: 116 Original Poster
Thanks, I will look at those links. I'm still a newb when it comes to bug hunting like this, especially on Linux. Thanks a lot.

 Tags c++, sdl

All times are GMT -5.