Hi all, I have a problem with an application I'm working on.
This app is SCADA-like: it uses the serial port for communication, ALSA for alarms, GTK for the graphical display, TCP sockets for data redundancy, and MySQL for configuration and log storage. The app crashes frequently (usually after 8-10 hours of work) without generating any error message. I found a document that describes the behavior of Linux signals: some signals generate a core dump, others only terminate the application. I have tried to change the behavior of the signals that only terminate the app so that they generate a core dump too.
The code creates a file with touch that records the number of the signal caught, and then dumps core with SIGABRT.
I tested this code by sending signals with the kill command and it works fine.
I launch the app and after 8-10 hours it crashes without generating any core dump.
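For reference, a handler of the kind described above could look roughly like this. It is only a sketch, not the poster's actual code: the path `CRASH_LOG` and the names `format_signo`, `dump_core_handler` and `install_dump_handlers` are made up for illustration. Instead of shelling out to touch, it sticks to async-signal-safe calls (open, write, close, signal, abort), since a handler can interrupt the program at any point.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical path; the post does not say where the file is written. */
#define CRASH_LOG "/tmp/app_last_signal.txt"

/* Write signo as decimal text into buf (capacity cap >= 1) without stdio.
   Returns the number of characters written, excluding the NUL. */
size_t format_signo(int signo, char *buf, size_t cap)
{
    char tmp[12];
    size_t n = 0, len = 0;

    if (signo < 0)
        signo = 0;
    do {
        tmp[n++] = (char)('0' + signo % 10);
        signo /= 10;
    } while (signo > 0 && n < sizeof tmp);
    while (n > 0 && len + 1 < cap)
        buf[len++] = tmp[--n];      /* digits were produced in reverse */
    buf[len] = '\0';
    return len;
}

/* Record the signal number, then abort so the kernel writes a core file. */
void dump_core_handler(int signo)
{
    char buf[16];
    size_t len = format_signo(signo, buf, sizeof buf - 1);
    int fd;

    buf[len++] = '\n';
    fd = open(CRASH_LOG, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        if (write(fd, buf, len) < 0) {
            /* Nothing safe to do about a failed write inside a handler. */
        }
        close(fd);
    }
    signal(SIGABRT, SIG_DFL);   /* make sure abort() is not intercepted */
    abort();                    /* default SIGABRT action dumps core */
}

/* Install the handler for each signal in the list. Returns 0 on success. */
int install_dump_handlers(const int *signals, size_t count)
{
    struct sigaction sa;
    size_t i;

    assert(signals != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = dump_core_handler;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < count; i++)
        if (sigaction(signals[i], &sa, NULL) < 0)
            return -1;
    return 0;
}
```

Note that a handler like this only runs for signals that are actually delivered to it. SIGKILL cannot be caught, so a process killed that way leaves neither the file nor a core.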
Besides SIGINT, what other signals are you catching?
If your app is seg-faulting, then you may want to catch SIGSEGV. But this won't help you track down where the bug is in your app. Also, it is quite possible that core-files are not being generated on your Linux system... because that feature is disabled. To enable the generation of core dumps, take a look at these instructions to see if they apply to your system.
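If the core-size limit turns out to be the issue, the application can also raise it for itself at startup. Here is a minimal sketch, assuming the hard limit is non-zero; the function name `enable_core_dumps` is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Raise the soft core-file size limit up to the hard limit.
   Stores the resulting soft limit in *new_soft. Returns 0 on success. */
int enable_core_dumps(rlim_t *new_soft)
{
    struct rlimit lim;

    assert(new_soft != NULL);
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;    /* an unprivileged process may go this far */
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    *new_soft = lim.rlim_cur;
    return 0;
}
```

Calling this early in main() has roughly the same effect as running `ulimit -c unlimited` in the shell before starting the program, as far as the hard limit allows. If the resulting limit is still 0, no core file will be written.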
Another thing you could do is spend some time perusing your code for any areas where data can be corrupted by buffer overruns. Look for functions like memcpy(), sprintf(), strcpy(), gets(), etc. The last three functions listed above should never be used; there are other safer equivalents to these. These functions are generally the typical suspects that cause seg-faulting and bus errors.
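As an illustration of the safer equivalents, here is a small sketch using snprintf() in place of strcpy()/sprintf() and fgets() in place of gets(). The helper names `copy_bounded` and `read_line_bounded` are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst (capacity cap), always NUL-terminating.
   Returns 0 if the whole string fit, -1 if it was truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    int n;

    assert(dst != NULL && src != NULL && cap > 0);
    n = snprintf(dst, cap, "%s", src);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Read one line with fgets() instead of gets(); the trailing newline,
   if any, is stripped. Returns 0 on success, -1 on EOF or error. */
int read_line_bounded(FILE *in, char *dst, size_t cap)
{
    assert(in != NULL && dst != NULL && cap > 0);
    if (fgets(dst, (int)cap, in) == NULL)
        return -1;
    dst[strcspn(dst, "\n")] = '\0';
    return 0;
}
```

The point is that every write is bounded by the destination size, and truncation is reported to the caller instead of silently overwriting adjacent memory.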
I catch the signals below. My system is correctly configured for core dumps; I have tested every signal listed below with the kill command. Only SIGPIPE does not generate a dump (I think it is ignored).
if (sigaction(SIGINT, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGHUP, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGPIPE, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGALRM, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGTERM, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGXCPU, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGXFSZ, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGVTALRM, &handler, 0) < 0)
    printf("sigaction() failed\n");
if (sigaction(SIGPROF, &handler, 0) < 0)
    printf("sigaction() failed\n");
I use this signal "map" to pick the signals I'm interested in (the ones whose default action is "terminate process"). My system is SLES 10 SP1.
NAME       Default Action     Description
SIGHUP     terminate process  terminal line hangup
SIGINT     terminate process  interrupt program
SIGQUIT    create core image  quit program
SIGILL     create core image  illegal instruction
SIGTRAP    create core image  trace trap
SIGABRT    create core image  abort(3) call (formerly SIGIOT)
SIGEMT     create core image  emulate instruction executed
SIGFPE     create core image  floating-point exception
SIGKILL    terminate process  kill program
SIGBUS     create core image  bus error
SIGSEGV    create core image  segmentation violation
SIGSYS     create core image  non-existent system call invoked
SIGPIPE    terminate process  write on a pipe with no reader
SIGALRM    terminate process  real-time timer expired
SIGTERM    terminate process  software termination signal
SIGURG     discard signal     urgent condition present on socket
SIGSTOP    stop process       stop (cannot be caught or ignored)
SIGTSTP    stop process       stop signal generated from keyboard
SIGCONT    discard signal     continue after stop
SIGCHLD    discard signal     child status has changed
SIGTTIN    stop process       background read attempted from control terminal
SIGTTOU    stop process       background write attempted to control terminal
SIGIO      discard signal     I/O is possible on a descriptor (see fcntl(2))
SIGXCPU    terminate process  cpu time limit exceeded (see setrlimit(2))
SIGXFSZ    terminate process  file size limit exceeded (see setrlimit(2))
SIGVTALRM  terminate process  virtual time alarm (see setitimer(2))
SIGPROF    terminate process  profiling timer alarm (see setitimer(2))
SIGWINCH   discard signal     Window size change
SIGINFO    discard signal     status request from keyboard
SIGUSR1    terminate process  User defined signal 1
SIGUSR2    terminate process  User defined signal 2
If you're interested in figuring out why your program is crashing after 8-9 hours, you've got several choices. Sticking "printf's" in a signal handler is *NOT* necessarily a good choice. Using "gdb" is.
Use valgrind to analyse your binary. It seems that you have some buffer overflow or orphaned pointer problem in your code. That would explain why the code dies after some amount of time. Logical coding problems usually crash much earlier.
I ended the gdb session for my app after it had been running for 28 hours.
The app crashed with a segmentation fault.
I ran the gcore command inside gdb to produce a core dump and analyze it.
The app stopped in thread no. 10 (gdb numbering) with this backtrace:
(gdb) bt
#0 0xb7dc9b09 in g_free () from /opt/gnome/lib/libgtk-x11-2.0.so.0
#1 0x00000025 in ?? ()
#2 0x0000000e in ?? ()
#3 0x10291390 in ?? ()
#4 0xb79011b8 in g_free () from /opt/gnome/lib/libglib-2.0.so.0
#5 0x0fb3fe30 in ?? ()
#6 0x0c3fb610 in ?? ()
#7 0xbfeb89c8 in ?? ()
#8 0xb7896ec7 in g_hash_table_lookup () from /opt/gnome/lib/libglib-2.0.so.0
Backtrace stopped: frame did not save the PC
(gdb)
It seems to be a problem with GTK, maybe a double free on a widget or some bad operation on a widget hash table. I don't understand where the problem is.
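If a double free is the suspect, one common defensive pattern is to reset a pointer to NULL right after freeing it. Here is a rough sketch in plain C, where malloc()/free() stand in for the GLib allocation calls and the struct `alarm_entry` and function `alarm_entry_clear` are hypothetical names, not anything from the application.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record owning one heap-allocated string. */
struct alarm_entry {
    char *label;
};

/* Release the label and reset the pointer, so that calling this a
   second time on the same entry is harmless instead of a double free. */
void alarm_entry_clear(struct alarm_entry *e)
{
    assert(e != NULL);
    free(e->label);     /* free(NULL) is defined to do nothing */
    e->label = NULL;
}
```

This only protects against freeing twice through the same pointer variable. If a second copy of the pointer exists elsewhere (for example, stored in a hash table), that copy still dangles and needs its own handling.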
If it runs forever (or for 28 hours) in gdb, then perhaps the difference is in the compiler switches used when compiling for gdb, and whether these automatically initialize data to zero. When you compile without the gdb options, the data is not initialized, possibly giving you the bad pointer.
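Whatever the reason for the difference between builds, initializing every field explicitly removes any dependence on what happens to be in memory. A minimal sketch, with a hypothetical `channel_state` struct standing in for the application's own data:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection state; field names are illustrative only. */
struct channel_state {
    char  *rx_buffer;   /* heap buffer, NULL until allocated */
    size_t rx_len;      /* bytes currently held in rx_buffer */
    int    fd;          /* descriptor, -1 when not open */
};

/* Put the structure into a known state before first use. */
void channel_state_init(struct channel_state *s)
{
    assert(s != NULL);
    s->rx_buffer = NULL;
    s->rx_len = 0;
    s->fd = -1;
}
```

With this in place, code that checks `rx_buffer != NULL` or `fd >= 0` before use behaves the same way in every build.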