LinuxQuestions.org
Old 09-25-2006, 02:05 PM   #1
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Rep: Reputation: 0
multithreaded programming - monitoring applications


Hi, I am writing a program that launches multiple applications, each as a separate process. I am using fork() and execve() to accomplish this.
What I'd like to do is find an efficient way to watch over these applications. Once I launch an application, I'd like to find out whether it exits normally or fails for any reason. If it fails, I'd like to relaunch it.

I was thinking of having another process act as a watch process and call waitpid() for each application, but I don't know whether I can call waitpid() multiple times or whether the calls would block each other. The worst-case scenario would be to have a thread for each application that waits until its application terminates, but I think this is not efficient at all.

How can I get signaled when any of the launched applications, which run in child processes, exits normally or fails? My understanding is that the wait calls block the calling process until it gets signaled, but since there are multiple applications running at the same time, more than one can exit or terminate at the same time. So how can I be notified that more than one application has terminated?

Ideally I'd like to find out as much detail as possible about why it failed: whether the application crashed because of a segfault, or whether it is hung in an infinite loop, etc. I do not know if there's a way to find this out. Is there?

Any help or ideas would be greatly appreciated.
 
Old 09-25-2006, 03:24 PM   #2
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,536

Rep: Reputation: 148
You can try to use ptrace, it may be fine in your situation.

Another solution is to use a wrapper over the program - simple shell script/program, that would send an info to specific program on exit.
 
Old 09-25-2006, 03:57 PM   #3
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Mara
You can try to use ptrace, it may be fine in your situation.

Another solution is to use a wrapper over the program - simple shell script/program, that would send an info to specific program on exit.

Thank you. The wrapper over the program seems like it might be a better solution. What do you mean by "send an info"? Do you mean a signal, like an interrupt?
 
Old 09-26-2006, 04:00 PM   #4
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,536

Rep: Reputation: 148
A signal is one of the options, but not the one I prefer. I'd choose a message via a pipe or a local socket.
 
Old 09-26-2006, 04:22 PM   #5
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 111
Perhaps put the other processes into one process group (the parent's, for example), then call waitpid() with the negation of that process group ID.
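A minimal sketch of the process-group idea, assuming the children are placed in a group of their own rather than the parent's. The helper names spawn_in_group() and reap_group() are made up for illustration, and _exit() stands in for the execve() call a real launcher would make.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that joins process group `pgid`
 * (pgid == 0 starts a new group led by the child) and exits with `code`.
 * A real launcher would call execve() where _exit() is. */
static pid_t spawn_in_group(pid_t pgid, int code)
{
    pid_t pid = fork();
    if (pid == 0) {
        setpgid(0, pgid);      /* set the group from the child side */
        _exit(code);
    }
    if (pid > 0)
        setpgid(pid, pgid);    /* and from the parent side, to avoid a race;
                                  an error here is harmless, so it is ignored */
    return pid;
}

/* Hypothetical helper: reap every child in process group `pgid`.
 * Returns the number of children reaped. */
static int reap_group(pid_t pgid)
{
    int status, n = 0;

    assert(pgid > 0);
    for (;;) {
        /* A negative first argument means "any child in group |pid|". */
        pid_t pid = waitpid(-pgid, &status, 0);
        if (pid > 0)
            n++;
        else if (errno != EINTR)
            break;             /* ECHILD: no children left in that group */
    }
    return n;
}
```

Usage would be something like: start the first app with spawn_in_group(0, ...), keep its PID as the group ID, start the others with that ID, and call reap_group() on it. Children inherit the parent's process group by default, so waitpid(0, &status, 0) already covers them if you never call setpgid() at all.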
 
Old 09-28-2006, 02:27 PM   #6
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by tuxdev
Perhaps put the other processes into one process group (the parent's, for example), then call waitpid() with the negation of that process group ID.

My design has actually changed a bit, and I have another question. Let's say the main process creates two threads via pthread_create(), say thread_1 and thread_2, and then thread_1 calls fork(); let's say it returns process ID 4.

What I'd like to do is have the second thread, thread_2, call wait4(-1, &status, options, &rusage) and wait for the processes created via fork() by the other thread, in this case thread_1.

Is this possible? I believe that since the threads are actually in the same process, when one of them forks a child process, that child is a child of both threads. Correct?

Will this work?

Also, I've never changed the process group ID. That seems like another option, but how does it work? Does anybody have any sample code? What is the default process group ID, and what are the risks, or what else happens, when you assign a process group ID to a process?

Thanks.
 
Old 09-28-2006, 02:49 PM   #7
tuxdev
Senior Member
 
Registered: Jul 2005
Distribution: Slackware
Posts: 2,012

Rep: Reputation: 111
Mixing threads and processes makes my head hurt. What you may want to consider is that in LinuxThreads, new threads really are new processes with shared memory. You can verify that by asking for the PID of the spawned threads. You can still make an implementation decision not to support LinuxThreads, since it is not fully POSIX-compliant. I'd just recommend sticking with either threads or processes and saving yourself the hassle of figuring out the semantics of mixing them, if you can.

http://www.csc.calpoly.edu/~tgoya/vssh.tar.gz
This is a really simple shell implementation that uses process groups that I did for a homework project. I don't guarantee that the link will stay valid.

Last edited by tuxdev; 09-28-2006 at 02:50 PM.
 
Old 09-28-2006, 03:44 PM   #8
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by tuxdev
Mixing threads and processes makes my head hurt. What you may want to consider is that in LinuxThreads, new threads really are new processes with shared memory. You can verify that by asking for the PID of the spawned threads. You can still make an implementation decision not to support LinuxThreads, since it is not fully POSIX-compliant. I'd just recommend sticking with either threads or processes and saving yourself the hassle of figuring out the semantics of mixing them, if you can.

http://www.csc.calpoly.edu/~tgoya/vssh.tar.gz
This is a really simple shell implementation that uses process groups that I did for a homework project. I don't guarantee that the link will stay valid.

I agree with you that this can be a headache, but as of now, I think that's what I need to implement, for the following reasons:

1) The process that I fork has to be its own process, since I call execve() in it, and I can't lose the other threads when I do that.

2) The reason to make thread_1 and thread_2 threads and not processes is that I need to save memory, and I need those threads to access and change data structures created by the main process that created them.

So whether or not they are new processes, I still have the same original question: is thread_2 still considered the parent of processes forked by thread_1?
 
Old 09-28-2006, 07:26 PM   #9
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by emge1
I agree with you that this can be a headache, but as of now, I think that's what I need to implement, for the following reasons:

1) The process that I fork has to be its own process, since I call execve() in it, and I can't lose the other threads when I do that.

2) The reason to make thread_1 and thread_2 threads and not processes is that I need to save memory, and I need those threads to access and change data structures created by the main process that created them.

So whether or not they are new processes, I still have the same original question: is thread_2 still considered the parent of processes forked by thread_1?

I have another question. What happens when you call waitpid() and the application it's waiting on crashes?

The man page says this:
Quote:
waitpid(): on success, returns the process ID of the child whose state has changed; on error, -1 is returned; if WNOHANG was specified and no child(ren) specified by pid has yet changed state, then 0 is returned.
So does waitpid() return -1 if the app crashes? Or does it return some indication of the signal, or whatever else, that caused the crash?
 
Old 09-28-2006, 08:04 PM   #10
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,346

Rep: Reputation: 1104
One program that's been doing this sort of thing since the very beginnings of Unix/Linux is init, which basically starts everything up and then sits around waiting for things to die. According to /etc/inittab, it can start them up again.

Basically, when a process dies, it becomes a "zombie" until its parent (or if there is no parent, init) "reaps" it. This lets you collect the status of the defunct process.

It is often the case that the progenitor process in a multi-thread application does little more than handle the spawning and reaping of child processes and threads, the latter of which actually do all of the work. It is also common that one of the children is basically a "watchdog" that wakes up periodically just to see if something is amiss.
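As a rough illustration of the "start it, wait for it to die, start it again" pattern described above, here is a sketch for a single application. The names launch() and supervise() and the restart limit are assumptions made for the example, not anything init itself uses.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] as a child process.
 * Returns the child's PID, or -1 if fork() failed. */
static pid_t launch(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);                /* only reached if execv() failed */
    }
    return pid;
}

/* Hypothetical helper: keep the program running until it exits with
 * status 0 or the restart budget is used up.
 * Returns the number of launches, or -1 on a fork/wait error. */
static int supervise(char *const argv[], int max_restarts)
{
    int launches = 0;

    assert(argv != NULL && argv[0] != NULL);
    for (;;) {
        int status;
        pid_t pid = launch(argv);

        if (pid < 0)
            return -1;
        launches++;
        if (waitpid(pid, &status, 0) < 0)   /* blocks; reaps the zombie */
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return launches;                /* clean exit: nothing to restart */
        if (launches > max_restarts)
            return launches;                /* give up */
        fprintf(stderr, "child %ld failed, relaunching\n", (long)pid);
    }
}
```

For example, supervise(argv, 5) with argv = { "/bin/sh", "-c", "exit 1", NULL } would launch the shell six times and then give up. Watching several applications at once needs waitpid(-1, ...) and a table mapping PIDs to applications instead of waiting on one PID.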
 
Old 09-29-2006, 01:36 AM   #11
Hko
Senior Member
 
Registered: Aug 2002
Location: Groningen, The Netherlands
Distribution: ubuntu
Posts: 2,530

Rep: Reputation: 108
Quote:
Originally Posted by emge1
I was thinking of having another process as a watch process, and calling waitpid() for each application, i don't know if i can call it waitpid multiple times or if they would block each other. Worse case scenario was to have a thread for each application and wait until each application terminated, but I think this is not efficient at all.
Having yet another process to watch the children of another won't work if it's not the parent of the applications started... This will only make things more complex without a real reason.

Quote:
how can I get signaled when any of the application that are launched which run in child processes exit normally or fail. My understanding is that wait calls block that process until it gets signaled, but since there is multiple applications running at the same time, more than one can exit or terminate at the same time, so how can i be notified that more than application terminated.
Normally wait() and waitpid() block until a child changes status (e.g. exited, stopped, ...). But you can specify the WNOHANG option ("return immediately if no child has exited") if you use waitpid() instead of just wait().

However, I think you don't need to worry about multiple threads and blocking at all. If I understand you correctly, you just need to call wait(&status) waitpid(-1, &status, 0) in a loop. When two or three or a hundred processes exit at the "same" time, your calls to wait() will return immediately until all exited children are "reaped". Then wait() will start blocking again until the next child exits.

An exited child process waits in the zombie (defunct) state until the parent calls wait(). It's not a problem at all if this takes (say) 1 second.

If you need to do other things in the parent process's loop that take time, just use waitpid(-1, &status, WNOHANG) instead and make sure the loop runs at least once a second or so. If your other activities in the parent process may also block, just register a SIGCHLD signal handler, so your parent will "know" when it needs to call wait().

Last edited by Hko; 09-29-2006 at 01:49 AM.
 
Old 09-29-2006, 10:38 AM   #12
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:
Having yet another process to watch the children of another won't work if it's not the parent of the applications started... This will only make things more complex without a real reason.
I am actually not having another process watch the children; it's a thread. A thread should work, right?

Quote:
Normally wait() and waitpid() block until a child changes status (e.g. exited, stopped, ...). But you can specify the WNOHANG option ("return immediately if no child has exited") if you use waitpid() instead of just wait().
Yes, I am going to use waitpid(-1, &status, WNOHANG) or wait4(-1, &status, WNOHANG, &rusage), i.e. with the WNOHANG option.

Quote:
However, I think you don't need to worry about multiple threads and blocking at all. If I understand you correctly, you just need to call wait(&status) waitpid(-1, &status, 0) in a loop. When two or three or a hundred processes exit at the "same" time, your calls to wait() will return immediately until all exited children are "reaped". Then wait() will start blocking again until the next child exits.
Are you saying I need to make TWO calls, wait(&status) AND waitpid(-1, &status, 0), or just one of the two?

What happens when there are multiple zombie processes in between waitpid() or wait4() calls?
Does it return the status of the first one and reap it, while the other zombie processes stay in the process table until waitpid() or wait4() is called again? Or does one call to waitpid() or wait4() reap all the zombie processes?

Quote:
An exited child process waits in the zombie (defunct) state until the parent calls wait(). It's not a problem at all if this takes (say) 1 second.

If you need to do other things in the parent process's loop that take time, just use waitpid(-1, &status, WNOHANG) instead and make sure the loop runs at least once a second or so. If your other activities in the parent process may also block, just register a SIGCHLD signal handler, so your parent will "know" when it needs to call wait().
Instead of what? Do you mean waitpid() instead of wait()?
The things that I need to do in between wait calls are: figure out what the status was, send a message to another process to report the status of the apps, get rusage info, and use a semaphore to make sure there are still apps running. The one that might block is the semaphore, but that should only happen if there are no apps to watch over.

What's the best way to wake up this thread and have it run only once every second or so? Should I sleep for 500 ms or something like that?

Yet another question: what happens when an app crashes? Does that app also become a zombie process that can then be waited on? Or is there no way to find out with wait whether an app crashed?

Last edited by emge1; 09-29-2006 at 10:42 AM.
 
Old 09-30-2006, 04:40 PM   #13
Hko
Senior Member
 
Registered: Aug 2002
Location: Groningen, The Netherlands
Distribution: ubuntu
Posts: 2,530

Rep: Reputation: 108
Quote:
Originally Posted by emge1
I am actually not having another process watch the children; it's a thread. A thread should work, right?
I'm not sure. I'd think so, but the LinuxThreads implementation of threads on Linux gives every thread its own PID (which is, I understand, uncommon). That fact makes me unsure about this.
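One way to settle this on a given system is simply to try it. The sketch below forks in one thread and waits in another; the function and struct names are made up, and _exit(7) stands in for execve(). On a system using NPTL, where all threads of a process share one PID, the wait in the second thread is expected to pick up the child. Under the older LinuxThreads it may not.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct watch_result {
    pid_t pid;      /* PID returned by waitpid(), or -1 */
    int   status;   /* raw wait status */
};

/* "thread_1": fork a child that exits with code 7. */
static void *launcher_thread(void *arg)
{
    pid_t *forked = arg;
    pid_t pid = fork();

    if (pid == 0)
        _exit(7);           /* child: a real launcher would execve() here */
    *forked = pid;
    return NULL;
}

/* "thread_2": wait for any child of the process. */
static void *watcher_thread(void *arg)
{
    struct watch_result *r = arg;

    r->pid = waitpid(-1, &r->status, 0);
    return NULL;
}

/* Hypothetical experiment: returns 1 if the watcher thread reaped the
 * child forked by the launcher thread, 0 if not, -1 on thread errors. */
static int cross_thread_wait(int *status)
{
    pthread_t t1, t2;
    pid_t forked = -1;
    struct watch_result r = { -1, 0 };

    assert(status != NULL);
    if (pthread_create(&t1, NULL, launcher_thread, &forked) != 0)
        return -1;
    pthread_join(t1, NULL);     /* the child exists once thread_1 is done */
    if (pthread_create(&t2, NULL, watcher_thread, &r) != 0)
        return -1;
    pthread_join(t2, NULL);
    *status = r.status;
    return forked > 0 && r.pid == forked;
}
```

Build with -pthread. Note that after fork() in a multithreaded program the child contains only the forking thread, so it should do nothing but async-signal-safe work before calling execve() or _exit().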

Quote:
Originally Posted by emge1
Are you saying I need to make TWO calls, wait(&status) AND waitpid(-1, &status, 0), or just one of the two?
No, I just forgot to type "or" between them. Sorry if that confused you.

Quote:
Originally Posted by emge1
What happens when there are multiple zombie processes in between waitpid() or wait4() calls?
Does it return the status of the first one and reap it, while the other zombie processes stay in the process table until waitpid() or wait4() is called again? Or does one call to waitpid() or wait4() reap all the zombie processes?
All wait...() functions return the status of one process. The next zombie will be reaped the next time you call one of the wait...() functions. This means a process may be a zombie for a short time. But this is no problem at all, since its parent is still alive and will reap it at some point.

Quote:
Originally Posted by emge1
Instead of what? Do you mean waitpid() instead of wait()?
I meant to say: instead of running a separate process that watches and reaps the apps as they run and terminate (a different process from the one that starts the apps; that's what I thought you were trying to do), and instead of running a separate "watch thread" for each running app.

Quote:
Originally Posted by emge1
The things that I need to do in between wait calls are: figure out what the status was, send a message to another process to report the status of the apps, get rusage info, and use a semaphore to make sure there are still apps running. The one that might block is the semaphore, but that should only happen if there are no apps to watch over.

What's the best way to wake up this thread and have it run only once every second or so? Should I sleep for 500 ms or something like that?
I would try using wait4() so you can get the rusage and the exit status in one go, but without WNOHANG, so your reaping loop will block unless there's a process to reap. Then you won't even need to wake it up once a second or so. It will collect the status and rusage when a process has exited, and will block (sleep, do nothing) if there's nothing to do... This is what I tried to explain in my first post.
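A sketch of one step of such a blocking loop, assuming a made-up struct child_report and helper wait_one(). Each call blocks until one child has terminated and returns its PID, status and resource usage together.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record of one terminated child. */
struct child_report {
    pid_t         pid;
    int           status;   /* decode with WIFEXITED(), WIFSIGNALED(), ... */
    struct rusage usage;    /* CPU time, max RSS, etc. of the child */
};

/* Block until one child terminates and fill in `rep`.
 * Returns 1 on success, 0 if there are no children left, -1 on error. */
static int wait_one(struct child_report *rep)
{
    assert(rep != NULL);
    for (;;) {
        rep->pid = wait4(-1, &rep->status, 0, &rep->usage);
        if (rep->pid > 0)
            return 1;
        if (errno == EINTR)
            continue;       /* interrupted by a signal: just wait again */
        return errno == ECHILD ? 0 : -1;
    }
}
```

The reaping thread would then run something like `while (wait_one(&rep) == 1) { report(&rep); }`, where report() is whatever sends the status on to the other process. When wait_one() returns 0 there is nothing left to watch, which is the case the semaphore was meant to cover.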

Then you also don't need a semaphore, I think.

Quote:
Originally Posted by emge1
Yet another question: what happens when an app crashes? Does that app also become a zombie process that can then be waited on? Or is there no way to find out with wait whether an app crashed?
I'm not sure, but what I think is: a program that really crashes receives a signal (e.g. SIGSEGV for a segfault) and dies. Its status will be that it was terminated by a signal, which you can detect from the status you got from a wait...() function with something like:
Code:
if (WIFSIGNALED(status)) {
    /* the child was killed by a signal; WTERMSIG(status) is the signal number */
}
Note: I'm not an expert, and I do not entirely understand what you are trying to do.
 
Old 10-02-2006, 10:00 AM   #14
emge1
LQ Newbie
 
Registered: Sep 2006
Posts: 15

Original Poster
Rep: Reputation: 0
Quote:

Note: I'm not an expert, and I do not entirely understand what you are trying to do.

Thanks so much for your help, it helps a great deal.
 
  


All times are GMT -5.
