LinuxQuestions.org


Ericxx 06-27-2013 11:09 AM

Multi-thread terminated unexpectedly with POSIX threads
 
Hello,

I am using POSIX threads to create several worker threads and schedule them to run one after another. However, after a random amount of run time, all of the worker threads are terminated. The problem does not happen on every run. Could you suggest how to debug this issue? Thanks very much.

Regards,
Eric

JohnGraham 06-27-2013 11:33 AM

Not wanting to sound patronising, but have you tried literally running the code in a debugger? Also, what do you mean by "all of the worker threads got terminated"? Did the whole program terminate? Was there a message printed to the console? What happened?

Ericxx 06-27-2013 12:03 PM

Quote:

Originally Posted by JohnGraham (Post 4979690)
Not wanting to sound patronising, but have you tried literally running the code in a debugger? Also, what do you mean by "all of the worker threads got terminated"? Did the whole program terminate? Was there a message printed to the console? What happened?

Thanks, John. Basically, I have four threads: the 1st does data acquisition, the 2nd data processing, the 3rd data communication, and the 4th data logging. The four threads are scheduled via condition variables so that they execute in order from 1 to 4 every 100 ms. The code can run smoothly for quite a while, printing to the console, but occasionally the whole program gets stuck and no more output appears in the console. I have tried running it in debug mode to locate the line of code that causes this issue, but then the whole OS crashes and drops from the GUI to console mode. I am also wondering whether the crash of my application caused the OS crash. Because this happens only occasionally, I find it hard to capture the intermediate state of the application to analyse the problem. Let me know if you need any other details. Thank you.
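For reference, here is a minimal sketch of the kind of scheme described above: four stages that take turns under one mutex and one condition variable. It is not Eric's code, which was never posted. The names `stage_main`, `run_pipeline`, `turn`, `cycles_left` and `order_log` are made up for illustration, and the 100 ms period is left out. The point it tries to show is that every wait sits in a `while` loop that re-checks the predicate, and every change to the predicate is made with the mutex held before broadcasting. Getting either of those wrong is a common cause of threads that hang "randomly".

```c
#include <assert.h>
#include <pthread.h>

#define NUM_STAGES 4
#define LOG_CAP 64

/* Hypothetical shared scheduling state; all of it is protected by `lock`. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t turn_changed = PTHREAD_COND_INITIALIZER;
static int turn;               /* index of the stage allowed to run next */
static int cycles_left;        /* full 1-to-4 passes still to run */
static int order_log[LOG_CAP]; /* records which stage ran, in order */
static int order_len;

static void *stage_main(void *arg)
{
    int id = *(int *)arg;
    assert(id >= 0 && id < NUM_STAGES);

    pthread_mutex_lock(&lock);
    for (;;) {
        /* Always wait in a loop: a wakeup may be spurious or meant
         * for another stage, so the predicate is re-checked. */
        while (cycles_left > 0 && turn != id)
            pthread_cond_wait(&turn_changed, &lock);
        if (cycles_left == 0)
            break;

        /* Stand-in for the stage's real work. It runs with the lock
         * held here only to keep the sketch short. */
        if (order_len < LOG_CAP)
            order_log[order_len++] = id;

        /* Hand over to the next stage; stage 0 starts a new cycle. */
        turn = (turn + 1) % NUM_STAGES;
        if (turn == 0)
            cycles_left--;
        pthread_cond_broadcast(&turn_changed);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs `cycles` passes; returns the number of stage runs, or -1 on error. */
int run_pipeline(int cycles)
{
    pthread_t threads[NUM_STAGES];
    int ids[NUM_STAGES];
    int created = 0, failed = 0, i;

    turn = 0;
    cycles_left = cycles;
    order_len = 0;

    for (i = 0; i < NUM_STAGES; i++) {
        ids[i] = i;
        if (pthread_create(&threads[i], NULL, stage_main, &ids[i]) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    if (failed) {
        /* Tell the stages that did start to stop, then fall through. */
        pthread_mutex_lock(&lock);
        cycles_left = 0;
        pthread_cond_broadcast(&turn_changed);
        pthread_mutex_unlock(&lock);
    }
    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return failed ? -1 : order_len;
}
```

Calling `run_pipeline(3)` from `main` should leave the stage indices 0, 1, 2, 3 repeated three times in `order_log`. If a real program hangs with a scheme like this, the usual suspects are a wait without a loop, a predicate changed without holding the mutex, or a stage that exits without passing the turn on.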

NevemTeve 06-27-2013 02:18 PM

> But the whole OS crashes

Perhaps your program has eaten up every single bit of memory?

JohnGraham 06-27-2013 02:51 PM

Quote:

Originally Posted by Ericxx (Post 4979712)
But the whole OS crashes

Seriously? You mean the *whole* OS? As in you have to reboot your computer? What OS are you using, and on what platform?

sundialsvcs 06-27-2013 06:37 PM

You've got a timing hole somewhere. Sux to debug it, but ... :banghead:

Ericxx 06-28-2013 09:12 AM

Quote:

Originally Posted by NevemTeve (Post 4979790)
> But the whole OS crashes

Perhaps your program has eaten up every single bit of memory?

It should not be. I have monitored the resources with vmstat, and there should be enough memory. The OS crashed only occasionally. Most of the time my program hung but the OS kept running. The strange point is that when the application hangs, the camera capture thread keeps running, while all the other threads get killed. So might this camera capture thread be the problem?

Ericxx 06-28-2013 09:18 AM

Quote:

Originally Posted by JohnGraham (Post 4979803)
Seriously? You mean the *whole* OS? As in you have to reboot your computer? What OS are you using, and on what platform?

I am using Ubuntu 12.04 running on an Intel Core2Duo computer stack. Today I tried several more times and the OS seems to be stable :) Now I am just trying to figure out the reason for the crash.

2ck 06-28-2013 09:25 AM

It sounds like your threads are deadlocking. Can you share your code or give a more exacting description of how the threads interact, including all synchronization constructs?

Ericxx 07-07-2013 09:20 PM

Quote:

Originally Posted by 2ck (Post 4980249)
It sounds like your threads are deadlocking. Can you share your code or give a more exacting description of how the threads interact, including all synchronization constructs?

Sorry for my late update. I think I have identified the problem, which is a mutex issue. One worker thread captures images continuously and updates the image that the other worker threads access. A single image data structure stores the latest image from the capture thread and is read by the other worker threads. I tried to protect it with a mutex, but the application still crashes at random times when run. Looking forward to your suggestions on this. Thank you.

Regards,
Eric

2ck 07-08-2013 12:41 AM

I read your description again and it actually doesn't sound like deadlock, so sorry about that. sundialsvcs is right that it's likely some race condition in your code...

If there are multiple ways to access an image as it is being processed, then make sure the same mutex lock surrounds the code in each of them. Also, the POSIX threads API has a non-blocking lock call, pthread_mutex_trylock. If you are using that, you should check whether the thread that tries to acquire the lock goes on to do something that assumes the lock is held even when the attempt failed. That said, you should minimize the places where you can access the shared image, probably to a pair of synchronized read/write functions, and have all access occur through those, at least to start with.
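A minimal sketch of that advice, under stated assumptions: the struct `shared_image`, the functions `image_publish`, `image_snapshot` and `image_try_snapshot`, and the tiny `IMG_BYTES` buffer are all hypothetical stand-ins, since the real image structure was not posted. The idea is that the capture thread and the readers touch the image only through these functions, that all of them take the same mutex, and that the non-blocking variant checks the return value of `pthread_mutex_trylock` before touching the data.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define IMG_BYTES 16 /* tiny stand-in for a real frame buffer */

/* Hypothetical shared image; `lock` guards both `seq` and `pixels`. */
struct shared_image {
    pthread_mutex_t lock;
    unsigned long seq; /* incremented on every published frame */
    unsigned char pixels[IMG_BYTES];
};

static struct shared_image latest = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Writer side: the capture thread publishes a complete frame. */
void image_publish(const unsigned char *src, size_t n)
{
    assert(n <= IMG_BYTES);
    pthread_mutex_lock(&latest.lock);
    memcpy(latest.pixels, src, n);
    latest.seq++;
    pthread_mutex_unlock(&latest.lock);
}

/* Reader side: copy the frame out under the lock and work on the copy,
 * so no reader holds a pointer into the shared buffer.
 * Returns the sequence number of the frame that was copied. */
unsigned long image_snapshot(unsigned char *dst, size_t n)
{
    unsigned long seq;

    assert(n <= IMG_BYTES);
    pthread_mutex_lock(&latest.lock);
    memcpy(dst, latest.pixels, n);
    seq = latest.seq;
    pthread_mutex_unlock(&latest.lock);
    return seq;
}

/* Non-blocking reader. Returns 1 if a frame was copied, 0 if the lock
 * was busy, -1 on any other error. The shared data is touched only
 * after pthread_mutex_trylock() has returned 0. */
int image_try_snapshot(unsigned char *dst, size_t n, unsigned long *seq_out)
{
    int rc;

    assert(n <= IMG_BYTES);
    rc = pthread_mutex_trylock(&latest.lock);
    if (rc == EBUSY)
        return 0; /* lock NOT held: do not read, do not unlock */
    if (rc != 0)
        return -1;
    memcpy(dst, latest.pixels, n);
    *seq_out = latest.seq;
    pthread_mutex_unlock(&latest.lock);
    return 1;
}
```

Copying the frame out costs a `memcpy` per reader, but it keeps the critical section short and means no thread processes pixels that the capture thread is overwriting. If the copy turns out to be too expensive for real frame sizes, double buffering is a common next step, though that goes beyond what was discussed here.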

That's all I can think of right now. A debugger probably would help since you'll be able to see what the threads were doing when they crashed. You'll still need to do some reasoning about why what they were doing was allowed to happen.

Good luck!

