Hello,
I am writing parallel code for a physics simulation that creates and joins threads very rapidly, although I never have more than 5 or 6 threads active at once. After the simulation runs for about half a million iterations, pthread_create starts returning EAGAIN and does not create the thread. This causes long pauses because I have to retry in a loop, i.e. while (pthread_create(...)); until it succeeds.
As another note, Windows has a maximum thread limit of 2019 or so, and the program works perfectly on Windows. The UNIX system on which the problem occurs has a thread limit of over 8000. Also, by watching the number of active threads in top and Task Manager, I can see that it is not increasing over time.
I also know that there is plenty of memory for the program; a lack of memory would likely cause one of the many other malloc() calls to fail, not a tiny little pthread_create call.
What are possible causes for this issue? I tried memset()-ing my pthread variables to zero before calling pthread_create and this actually made the problem occur less often.
Here is the block that creates the threads...
Code:
printf("|%d|", me); fflush(stdout);
// Create work threads for each interaction
for (i=0, j=me; i<=p[me].t_num; j=p[me].t_box[i++]) {
    memset(p[j].a_, 0, p[j].num*12);
    if (i) {
        printf("[a"); fflush(stdout);
        k = pthread_create(&attach_t[i-1], NULL, attach, &i);
        printf("%d", k); fflush(stdout);
        if (k) exit(0);
        pthread_mutex_lock(&arg3_mutex);
        printf("]"); fflush(stdout);
    } else {
        printf("[c"); fflush(stdout);
        k = pthread_create(&calc_t, NULL, calc, NULL);
        printf("%d]", k); fflush(stdout);
        if (k) exit(0);
    }
}
printf("(r)"); fflush(stdout);
...
// Join work threads
pthread_join(calc_t, NULL);
memset(&calc_t, 0, sizeof(pthread_t));
for (i=0; i<p[me].t_num; i++)
    pthread_join(attach_t[i], NULL);
memset(attach_t, 0, p[me].t_num*sizeof(pthread_t));
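For anyone who wants to try this without my simulation, the create/join pattern above boils down to roughly the following sketch. It is not my actual code: run_iterations, worker, and NUM_WORKERS are names made up for illustration. Unlike the block above, it also checks the return value of pthread_join, since a join that fails would leave that thread's resources unreclaimed.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NUM_WORKERS 5  /* hypothetical: roughly how many threads are alive at once */

/* Hypothetical stand-in for the real calc/attach work. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot += 1;  /* each thread touches only its own slot */
    return NULL;
}

/* Create and join NUM_WORKERS threads per iteration.
 * Returns the number of iterations that completed before the first
 * pthread_create or pthread_join failure (equal to `iterations` if none). */
static long run_iterations(long iterations, int counts[NUM_WORKERS])
{
    pthread_t tid[NUM_WORKERS];
    long it;

    for (it = 0; it < iterations; it++) {
        int created = 0;
        int failed = 0;
        int i, rc;

        for (i = 0; i < NUM_WORKERS; i++) {
            rc = pthread_create(&tid[i], NULL, worker, &counts[i]);
            if (rc != 0) {
                fprintf(stderr, "iteration %ld: pthread_create: %s\n",
                        it, strerror(rc));
                failed = 1;
                break;
            }
            created++;
        }

        /* Join only the threads that were actually created. */
        for (i = 0; i < created; i++) {
            rc = pthread_join(tid[i], NULL);
            if (rc != 0) {
                fprintf(stderr, "iteration %ld: pthread_join: %s\n",
                        it, strerror(rc));
                failed = 1;
            }
        }

        if (failed)
            return it;
    }
    return it;
}
```

Calling run_iterations(1000000, counts) from main and comparing the return value against the requested count would show at which iteration, if any, creation or joining starts to fail.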
Any suggestions would be appreciated. The code must run stably for a long time on both Windows and UNIX.
Thanks,
Adam