Hi all,
My program creates 15 pairs of threads, each pair associated with its own semaphore. For example, pair 1 consists of T1 and T2: T1 calls sem_wait() and T2 calls sem_post() on semaphore S1. sem_getvalue() is called before and after each sem_wait() and each sem_post(). The other 14 pairs follow the same flow. In addition, a separate thread periodically checks the values of all 15 semaphores with sem_getvalue().
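The flow of one pair can be sketched roughly as follows. This is an illustrative reconstruction, not the actual program: the names sem_pair, waiter_thread, poster_thread and run_pair are hypothetical, and the iteration count is an assumption added so that the sketch terminates.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical state for one thread pair; names are illustrative only. */
struct sem_pair {
    sem_t sem;          /* plays the role of S1 */
    int iterations;     /* assumed bound so the sketch terminates */
    int wait_failures;  /* sem_wait() calls that returned -1, e.g. EINTR */
};

/* T1: waits on the semaphore, sampling its value around each call. */
static void *waiter_thread(void *arg)
{
    struct sem_pair *p = (struct sem_pair *)arg;
    int done = 0, before = 0, after = 0;

    while (done < p->iterations) {
        sem_getvalue(&p->sem, &before);
        if (sem_wait(&p->sem) == 0)
            done++;
        else
            p->wait_failures++;   /* simply call sem_wait() again */
        sem_getvalue(&p->sem, &after);
    }
    (void)before;
    (void)after;
    return NULL;
}

/* T2: posts the semaphore, sampling its value around each call. */
static void *poster_thread(void *arg)
{
    struct sem_pair *p = (struct sem_pair *)arg;
    int i, before = 0, after = 0;

    for (i = 0; i < p->iterations; i++) {
        sem_getvalue(&p->sem, &before);
        sem_post(&p->sem);
        sem_getvalue(&p->sem, &after);
    }
    (void)before;
    (void)after;
    return NULL;
}

/* Runs one pair to completion; returns the final semaphore value,
 * or -1 if setup failed. */
static int run_pair(int iterations)
{
    struct sem_pair p;
    pthread_t t1, t2;
    int final_value = -1;

    p.iterations = iterations;
    p.wait_failures = 0;
    if (sem_init(&p.sem, 0, 0) != 0)
        return -1;
    if (pthread_create(&t1, NULL, waiter_thread, &p) != 0) {
        sem_destroy(&p.sem);
        return -1;
    }
    if (pthread_create(&t2, NULL, poster_thread, &p) != 0) {
        pthread_cancel(t1);       /* sem_wait() is a cancellation point */
        pthread_join(t1, NULL);
        sem_destroy(&p.sem);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_getvalue(&p.sem, &final_value);
    sem_destroy(&p.sem);
    return final_value;
}
```

Each post is matched by exactly one successful wait, so the semaphore should be back at zero once both threads have finished. The real program runs 15 such pairs plus the monitoring thread.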
I have run a series of simulation tests to evaluate the program's performance. The program runs well for a few hours, but once in a while sem_wait() fails with the error "system call interrupted" (errno 4, EINTR). The interruption makes that sem_wait() call return, but it does not stop T1 from calling sem_wait() again, and the very next call always succeeds without being interrupted. The interruption puzzles me because it is unpredictable. As I understand it, T1 should block in sem_wait() indefinitely until T2 calls sem_post().
My questions are:
- "Who" is sending the interrupt signal? The program does not send any interrupt signal.
- Under what conditions does sem_wait() get interrupted? Is it related to priority inversion, deadlock, NPTL, or a kernel issue? [Note: the threads were created with default attributes.]
- How can I avoid interrupted system calls on sem_wait()?
- How should I start troubleshooting?
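For reference, a common way to tolerate an interrupted sem_wait() is to retry the call while it fails with EINTR. A minimal sketch, assuming a retry is acceptable for the program's logic; sem_wait_retry is a hypothetical helper name, not part of any library and not taken from the program above:

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: call sem_wait() again whenever it was interrupted.
 * Returns 0 on success, or -1 with errno set for any error other than EINTR. */
static int sem_wait_retry(sem_t *sem)
{
    int rc;

    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```

This hides the symptom rather than explaining it, so it does not answer the question of where the interruption comes from.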
For your information:
bash$ uname -a
Linux LinuxDB 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux
bash$ ls /lib/libpthread*
/lib/libpthread-0.10.so /lib/libpthread.so.0
bash$ ls /lib/i686/libpthread*
/lib/i686/libpthread-0.10.so /lib/i686/libpthread.so.0
bash$ /lib/libc.so.6
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.2.3 20030502 (Red Hat Linux 3.2.3-20).
Compiled on a Linux 2.4.20 system on 2003-10-02.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
linuxthreads-0.10 by Xavier Leroy
The C stubs add-on version 2.1.2.
BIND-8.2.3-T5B
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Glibc-2.0 compatibility add-on by Cristian Gafton
libthread_db work sponsored by Alpha Processor Inc
Thread-local storage support included.
Report bugs using the `glibcbug' script to <bugs@gnu.org>.
bash$ /lib/tls/libc.so.6
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.2.3 20030502 (Red Hat Linux 3.2.3-20).
Compiled on a Linux 2.4.20 system on 2003-10-02.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
NPTL 0.60 by Ulrich Drepper
RT using linux kernel aio
The C stubs add-on version 2.1.2.
BIND-8.2.3-T5B
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Glibc-2.0 compatibility add-on by Cristian Gafton
Thread-local storage support included.
Report bugs using the `glibcbug' script to <bugs@gnu.org>.
bash$ g++ --version
g++ (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-20)
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
bash$ ldd --version
ldd (GNU libc) 2.3.2
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
bash$ getconf GNU_LIBPTHREAD_VERSION
NPTL 0.60
bash$ ldd test_program
libnsl.so.1 => /lib/libnsl.so.1 (0xb75d5000)
librt.so.1 => /lib/tls/librt.so.1 (0xb75c1000)
libpthread.so.0 => /lib/tls/libpthread.so.0 (0xb75b1000)
libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0xb74fe000)
libm.so.6 => /lib/tls/libm.so.6 (0xb74dc000)
libc.so.6 => /lib/tls/libc.so.6 (0xb73a5000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb739b000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb75eb000)
Makefile compilation options:
CC = g++
CFLAGS = -g -O3 -Wall \
-DPOSIX_C_SOURCE=199506L -D_REENTRANT
LDFLAGS = -lc -mt -lnsl -lrt -lpthread
In addition, the program's memory usage was observed to decrease gradually over time.
The program also runs on Sun Solaris 2.8, where no interrupted sem_wait() calls have been seen so far.
Any help would be highly appreciated. Thanks.