Locking mutex in a signal handler function

lbdgwgt · 02-15-2010, 06:53 AM

Hi all,

I have a question (as the title says) about locking a mutex (pthread_mutex_t type) in a signal handler function (installed with signal()) on Linux.

It seems that if the mutex has been previously locked by another thread outside the signal handler function and then the signal handler function tries to lock it, the whole process hangs.

Can someone explain what the problem is here?

Thanks for any answer

neonsignal · 02-15-2010, 09:33 AM

Yes, you can't reliably call a mutex lock or unlock from a signal handler. The problem is that the signals are asynchronous, which means they could occur while the thread has the mutex locked. This will cause a deadlock, because the signal handler cannot acquire the mutex until the thread resumes, but the thread cannot resume until the signal handler has completed.

If you have code like this inside a signal handler, you either need to rework it to simplify the handler, or have the handler initiate a thread (or signal an existing one) to perform the task.
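A minimal sketch of the "simplify the handler" option, assuming the work can wait until ordinary code checks a flag. The names on_signal, install_handler, process_pending_signal, counter_lock and handled_count are made up for illustration. The handler only sets a volatile sig_atomic_t flag; the mutex is locked later, outside signal context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Hypothetical shared state, protected by a mutex (illustrative names). */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long handled_count = 0;

/* The only object the handler touches. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;              /* no locks, no library calls */
}

/* Install the handler with sigaction(); returns 0 on success, -1 on error. */
int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from ordinary code, e.g. once per main-loop iteration.
 * Returns 1 if a pending notification was processed, 0 otherwise. */
int process_pending_signal(void)
{
    int rc;

    if (!got_signal)
        return 0;
    got_signal = 0;              /* signals arriving close together coalesce */

    rc = pthread_mutex_lock(&counter_lock);   /* fine here: normal context */
    assert(rc == 0);
    handled_count++;
    rc = pthread_mutex_unlock(&counter_lock);
    assert(rc == 0);
    return 1;
}
```

The trade-off is latency: nothing happens until the program next calls process_pending_signal(), and several signals that arrive between two checks are seen as one.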

lbdgwgt · 02-15-2010, 10:04 AM

Hi,

Thanks for the answer. I have further questions:

What do you mean by signaling an existing thread as a solution to the problem above?

What is actually the difference between a signal handler and a normal thread in Linux? That is, why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that the normal thread has already locked?

ForzaItalia2006 · 02-15-2010, 12:07 PM

Quote:

Originally Posted by lbdgwgt

What do you mean by signaling an existing thread as a solution to the problem above?

As already described, a signal handler is called asynchronously by the operating system/C library when your program receives a signal (assuming you have installed a handler for that specific signal and the signal isn't SIGKILL or SIGSTOP, which cannot be caught). Asynchronously means that the current execution is interrupted/suspended and control is transferred to the signal handler through some kind of notification mechanism. The original execution resumes ONLY WHEN the signal handler has completed.

To shorten the signal handling and the pause time of your application, the signal handler could just start a new thread that does the work, and then return quickly. Within this newly created thread, you could then safely try to acquire the mutex without any deadlock occurring.

Quote:

Originally Posted by lbdgwgt

What is actually the difference between a signal handler and a normal thread in Linux? That is, why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that the normal thread has already locked?

Signals are usually used to "signal" the process about some extraordinary situation (e.g. div-by-zero or segmentation fault/invalid memory access) or just to interrupt the process (SIGUSR1, SIGUSR2 or SIGINT). By design, though, UNIX and UNIX-based systems run the signal handler to completion before resuming the interrupted execution path of the application. Suppose you catch a SIGSEGV (segmentation fault/invalid memory access): you don't want your usual execution to continue until you have resolved the error ...

Hope that helps,
- Andi -

wje_lq · 02-15-2010, 08:45 PM

Quote:

Originally Posted by neonsignal

Yes, you can't reliably call a mutex lock or unlock from a signal handler. The problem is that the signals are asynchronous, which means they could occur while the thread has the mutex locked. This will cause a deadlock, because the signal handler cannot acquire the mutex until the thread resumes, but the thread cannot resume until the signal handler has completed.

There is a problem much worse than that. This additional problem is that a signal handler can try to acquire a mutex while another thread doesn't yet have the mutex, but is in the middle of acquiring it. You'll corrupt the execution of your program. Good luck in trying to debug that.

Quote:

Originally Posted by neonsignal

If you have code like this inside a signal handler, you either need to rework it to simplify the handler, or have the handler initiate a thread (or signal an existing one) to perform the task.

Having the handler initiate a thread won't (always) work either. Your handler might be trying to initiate a thread while another thread is either in the process of being created or in the process of being destroyed. You'll corrupt the execution of your program.

Quote:

Originally Posted by lbdgwgt

What is actually the difference between a signal handler and a normal thread in Linux? That is, why can't the normal thread in the problem above resume after the signal handler tries to lock the mutex that the normal thread has already locked?

Only one thread can have a mutex locked at a time. If a mutex has been previously locked by the normal thread, and that lock is still in place, then the signal handling thread will wait for the normal thread to unlock the mutex.

neonsignal · 02-15-2010, 09:58 PM

Quote:

what do you mean by signaling an existing thread as a solution to the problem above?

This means using one of the async-signal-safe functions such as sem_post to communicate to a thread that the signal has occurred.
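A rough sketch of that idea on Linux, assuming an unnamed POSIX semaphore and SIGUSR1. The names signal_sem, data_lock, events_handled, worker and run_demo are invented for the example. The handler only calls sem_post; a worker thread waits on the semaphore and does the mutex-protected work in normal thread context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>

static sem_t signal_sem;                  /* posted by the handler */
static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static int events_handled = 0;            /* protected by data_lock */

static void on_signal(int signo)
{
    int saved_errno = errno;              /* sem_post may change errno */
    (void)signo;
    sem_post(&signal_sem);                /* async-signal-safe */
    errno = saved_errno;
}

/* Worker: waits for each notification, then does the locked work. */
static void *worker(void *arg)
{
    int remaining = *(int *)arg;

    while (remaining > 0) {
        while (sem_wait(&signal_sem) == -1 && errno == EINTR)
            continue;                     /* retry if interrupted */
        pthread_mutex_lock(&data_lock);   /* safe: not in a handler */
        events_handled++;
        pthread_mutex_unlock(&data_lock);
        remaining--;
    }
    return NULL;
}

/* Demo driver: sends `count` signals to the calling thread and returns
 * the number of events the worker handled, or -1 on setup failure. */
int run_demo(int count)
{
    struct sigaction sa;
    pthread_t tid;

    assert(count >= 0);
    if (sem_init(&signal_sem, 0, 0) != 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, &count) != 0)
        return -1;

    for (int i = 0; i < count; i++)
        raise(SIGUSR1);                   /* handler runs in this thread */

    pthread_join(tid, NULL);
    return events_handled;
}
```

Because the semaphore counts posts, each handler invocation wakes the worker once, which a plain flag would not guarantee.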

Often it is easier to use something like sigwait or sigsuspend so that the thread processing the signals can wait directly on the signal, instead of it being mediated by user code.
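A small sketch of the sigwait variant, assuming SIGUSR1 and a single dedicated thread. The names signal_thread, run_sigwait_demo, data_lock and events_handled are hypothetical. The signal is blocked before the thread is created, so no handler ever runs and the thread receives the signal synchronously.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static int events_handled = 0;            /* protected by data_lock */

/* Dedicated thread: receives one SIGUSR1 synchronously via sigwait(). */
static void *signal_thread(void *arg)
{
    const sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) == 0 && signo == SIGUSR1) {
        /* Ordinary thread context, so locking a mutex is fine. */
        pthread_mutex_lock(&data_lock);
        events_handled++;
        pthread_mutex_unlock(&data_lock);
    }
    return NULL;
}

/* Returns the number of signals handled, or -1 on setup failure. */
int run_sigwait_demo(void)
{
    sigset_t set;
    pthread_t tid;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block the signal first: threads created later inherit this mask,
     * so the signal stays pending until sigwait() collects it. */
    rc = pthread_sigmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);

    if (pthread_create(&tid, NULL, signal_thread, &set) != 0)
        return -1;

    kill(getpid(), SIGUSR1);              /* process-directed signal */
    pthread_join(tid, NULL);
    return events_handled;
}
```

In a real program the thread would loop on sigwait, and the mask would be set in main before any other thread is started.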

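As wje_lq points out, the pthread mutex and thread-creation functions are not async-signal-safe (POSIX lists only a handful of pthread functions, such as pthread_kill and pthread_sigmask, as safe). This is especially dangerous, because unsafe calls will appear to work 'most' of the time.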