Quote:
Originally Posted by JohnGraham
Is this apparent from the pthread_mutex_lock.c source code (which I don't have to hand)? Otherwise, I can't see how you can make that link - because the assertion seems to happen within the call to pthread_mutex_lock, it hasn't returned EDEADLK, since it hasn't returned anything - it's asserted and aborted before its time's up.
I see what you're trying to say.
I'm attaching pthread_mutex_lock.c. This is the location of the assert:
258 oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
259 newval, 0);
260
261 if (oldval != 0)
262 {
263 /* The mutex is locked. The kernel will now take care of
264 everything. */
265 INTERNAL_SYSCALL_DECL (__err);
266 int e = INTERNAL_SYSCALL (futex, __err, 4, &mutex->__data.__lock,
267 FUTEX_LOCK_PI, 1, 0);
268
269 if (INTERNAL_SYSCALL_ERROR_P (e, __err)
270 && (INTERNAL_SYSCALL_ERRNO (e, __err) == ESRCH
271 || INTERNAL_SYSCALL_ERRNO (e, __err) == EDEADLK))
272 {
273 assert (INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK
274 || (kind != PTHREAD_MUTEX_ERRORCHECK_NP
275 && kind != PTHREAD_MUTEX_RECURSIVE_NP));
276 /* ESRCH can happen only for non-robust PI mutexes where
277 the owner of the lock died. */
278 assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);
279
280 /* Delay the thread indefinitely. */
281 while (1)
282 pause_not_cancel ();
283 }
284
285 oldval = mutex->__data.__lock;
286
287 assert (robust || (oldval & FUTEX_OWNER_DIED) == 0);
288 }
I got misled by this code here. I thought this was what should get executed for a mutex of type PTHREAD_MUTEX_RECURSIVE:
239 if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
240 {
241 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
242
243 /* Just bump the counter. */
244 if (__builtin_expect (mutex->__data.__count + 1 == 0, 0))
245 /* Overflow of the counter. */
246 return EAGAIN;
247
248 ++mutex->__data.__count;
249
250 return 0;
251 }
However, this is where the PTHREAD_MUTEX_RECURSIVE case actually gets handled, right at the beginning:
46 switch (__builtin_expect (mutex->__data.__kind, PTHREAD_MUTEX_TIMED_NP))
47 {
48 /* Recursive mutex. */
49 case PTHREAD_MUTEX_RECURSIVE_NP:
50 /* Check whether we already hold the mutex. */
51 if (mutex->__data.__owner == id)
52 {
53 /* Just bump the counter. */
54 if (__builtin_expect (mutex->__data.__count + 1 == 0, 0))
55 /* Overflow of the counter. */
56 return EAGAIN;
57
58 ++mutex->__data.__count;
59
60 return 0;
61 }
62
63 /* We have to get the mutex. */
64 LLL_MUTEX_LOCK (mutex->__data.__lock);
65
66 assert (mutex->__data.__owner == 0);
67 mutex->__data.__count = 1;
68 break;
My mutex is set up with:
pthread_mutexattr_settype(&mutexAttrib, PTHREAD_MUTEX_RECURSIVE);
(PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP)
So is it correct to say that, since execution did not take the case PTHREAD_MUTEX_RECURSIVE_NP branch, the mutex data structure must have been corrupted?
Quote:
Originally Posted by JohnGraham
If you're sure EDEADLK is returned (or about to be returned), have you made sure that all the relevant calls to pthread_mutexattr_{init,settype} are (a) made correctly and (b) have error conditions spotted and dealt with appropriately? If such an error is logged, the logs may show some reason why the PTHREAD_MUTEX_RECURSIVE setting couldn't be used - can't think why, but that's computers for you I guess...
That's a good suggestion. I'll check for that if I see the problem again.
Thanks again
Nikhil