I have a question (as the title says) about locking a mutex (pthread_mutex_t type) in a signal handler function (installed with signal()) on Linux.
It seems that if the mutex has been previously locked by another thread outside the signal handler function and then the signal handler function tries to lock it, the whole process hangs.
Can someone explain to me what is the problem here?
Yes, you can't reliably call a mutex lock or unlock from a signal handler. The problem is that the signals are asynchronous, which means they could occur while the thread has the mutex locked. This will cause a deadlock, because the signal handler cannot acquire the mutex until the thread resumes, but the thread cannot resume until the signal handler has completed.
If you have code like this inside a signal handler, you either need to rework it to simplify the handler, or have the handler initiate a thread (or signal an existing one) to perform the task.
what do you mean by signaling an existing thread as a solution to the problem above?
What actually differs between a signal handler and a normal thread in Linux (i.e. why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that was previously locked by the normal thread)?
Quote:
Originally Posted by lbdgwgt
what do you mean by signaling an existing thread as a solution to the problem above?
As already described, a signal handler is called asynchronously by the operating system/C library when your program receives a signal (assuming you defined a signal handler for this specific signal and the signal isn't SIGKILL or SIGSTOP). Asynchronously means that the current execution is interrupted/suspended and control is transferred to the signal handler through some kind of notification mechanism. The original execution is resumed ONLY WHEN the signal handler has completed.
To keep signal handling fast and shorten the time your application is paused, the signal handler could just start a new thread that does the work, and then return quickly. Within this newly created thread, you could then safely try to acquire the mutex without any deadlock occurring.
Quote:
Originally Posted by lbdgwgt
What actually differs between a signal handler and a normal thread in Linux (i.e. why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that was previously locked by the normal thread)?
Signals are usually used to "signal" the process about some extraordinary situation (e.g. a division by zero or a segmentation fault/invalid memory access) or just to interrupt the process (SIGUSR1, SIGUSR2, or SIGINT). It is the design of UNIX and UNIX-based systems that the handler completes before the application's current execution path continues. Suppose you catch a SIGSEGV (segmentation fault/invalid memory access): you don't want your usual execution to continue until you have resolved the error ...
Quote:
Originally Posted by neonsignal
Yes, you can't reliably call a mutex lock or unlock from a signal handler. The problem is that the signals are asynchronous, which means they could occur while the thread has the mutex locked. This will cause a deadlock, because the signal handler cannot acquire the mutex until the thread resumes, but the thread cannot resume until the signal handler has completed.
There is a problem much worse than that. This additional problem is that a signal handler can try to acquire a mutex while another thread doesn't yet have the mutex, but is in the middle of acquiring it. You'll corrupt the execution of your program. Good luck in trying to debug that.
Quote:
Originally Posted by neonsignal
If you have code like this inside a signal handler, you either need to rework it to simplify the handler, or have the handler initiate a thread (or signal an existing one) to perform the task.
Having the handler initiate a thread won't (always) work either. Your handler might be trying to initiate a thread while another thread is either in the process of being created or in the process of being destroyed. You'll corrupt the execution of your program.
Quote:
Originally Posted by lbdgwgt
What actually differs between a signal handler and a normal thread in Linux (i.e. why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that was previously locked by the normal thread)?
Only one thread can have a mutex locked at a time. If a mutex has been previously locked by the normal thread, and that lock is still in place, then the signal handling thread will wait for the normal thread to unlock the mutex.
Quote:
Originally Posted by lbdgwgt
what do you mean by signaling an existing thread as a solution to the problem above?
This means using one of the async-signal-safe functions such as sem_post to communicate to a thread that the signal has occurred.
Often it is easier to use something like sigwait or sigsuspend so that the thread processing the signals can wait directly on the signal, instead of it being mediated by user code.
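As wje_lq points out, the pthread mutex and thread creation functions are not async-signal-safe (the POSIX list of async-signal-safe functions contains almost no pthread calls). This is especially dangerous, because they will appear to work 'most' of the time.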
Last edited by neonsignal; 02-15-2010 at 10:11 PM.