We are running a family of remote data collection devices (kernel 2.6.21 on ARM) that store data onto a USB memory stick. Because writes to the stick take so long, we have opted to have a child process handle the file copy so that the parent is unfettered in its quest to continue collecting geomagnetic data. This leaves us with an odd tradeoff decision (which I am sure the gurus here can answer).
We are using the traditional fork/execl and waitpid from the C API. New files are written to a ramdisk and need to be copied to memory stick every 12 minutes. File copy to memory stick takes ~2 minutes.
Choice 1 : child finishes file copy and exits, entering the defunct state. The parent then waits on the defunct child just before it forks another child. This means that the child is sitting in the defunct state for about 10 minutes before it is waited on.
Choice 2 : child finishes file copy and sleeps. The parent then axes the sleeping child just before it forks another child.
Other choices?
Anyone know which approach might give us the most stable system?
The first choice -- letting the child stay in defunct state -- is perfectly okay. The kernel releases the resources used by the child, keeping only the process ID and exit status, so there are really no downsides to this.
You can call waitpid(childpid,&status,WNOHANG) every now and then to see if the child has exited yet. It will return childpid if the child has exited, 0 if it is still running, and -1 (and errno set) if an error has occurred. In your case, I don't think it is necessary -- as I said, it is perfectly okay to let the child stay defunct until just before the next fork.
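A minimal sketch of that non-blocking check, under stated assumptions: the helper names start_copy() and poll_child() are made up for illustration, and exec'ing /bin/cp is a guess, since the thread does not say which program the child runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start the copy in a child process.
 * Using /bin/cp is an assumption, not something stated in the thread. */
pid_t start_copy(const char *src, const char *dst)
{
    pid_t pid = fork();

    if (pid == 0) {
        execl("/bin/cp", "cp", src, dst, (char *)NULL);
        _exit(127);   /* only reached if execl() failed */
    }
    return pid;       /* child PID in the parent, -1 if fork() failed */
}

/* Hypothetical helper: check on the child without blocking.
 * Returns  1 if the child has exited and was reaped (*status is filled in),
 *          0 if the child is still running,
 *         -1 on error (errno is set by waitpid()). */
int poll_child(pid_t childpid, int *status)
{
    pid_t r;

    do {
        r = waitpid(childpid, status, WNOHANG);
    } while (r == (pid_t)-1 && errno == EINTR);

    if (r == childpid)
        return 1;
    if (r == 0)
        return 0;
    return -1;
}
```

The parent could call poll_child() from its collection loop, or only once just before the next fork(). After it returns 1, WIFEXITED(status) and WEXITSTATUS(status) tell whether the copy succeeded.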
You are assuming that the normal timing of a 2 minute write every 12 minutes will always hold true. Suppose you have some kind of error condition where the write has not completed in 12 minutes. In that case your solution 1 will wait for the write to finish. If the write actually finishes, but late, then solution 1 will preserve your file integrity. If the write never finishes then solution 1 will hang your parent process.
Under the same error condition, solution 2 will lose the data still waiting to be written and might corrupt the file. But the parent process will not hang.
I suggest that you use solution 1 but test the wait condition instead of issuing an unconditional wait. If the test shows that the child process is not finished then treat it as an error condition.
---------------------
Steve Stites
Edit: After carefully rereading Nominal Animal's reply I think that his answer is the same as mine, just with different wording.
Yes, it is definitely a good idea to use waitpid(childpid,&status,WNOHANG) to reap the child process without waiting for it to exit. If sufficient time has passed to indicate the child has hung, kill the child via kill(childpid,SIGKILL); and reap it using waitpid(childpid,&status,0) .
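As a rough sketch of that sequence, assuming the parent calls it once per cycle just before the next fork(): the helper name reap_or_kill() is hypothetical, and "still running at this point" is taken to mean "hung", which is a policy choice rather than anything the calls themselves decide.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, meant to be called just before the next fork().
 * Returns  0 if the child exited on its own,
 *          1 if it was still running and was terminated by SIGKILL,
 *         -1 on error (errno is set).
 * In the first two cases *status holds the wait status. */
int reap_or_kill(pid_t childpid, int *status)
{
    pid_t r = waitpid(childpid, status, WNOHANG);

    if (r == childpid)
        return 0;                 /* already exited; now reaped */
    if (r == (pid_t)-1)
        return -1;

    /* Still running after a whole cycle: treat it as hung. */
    if (kill(childpid, SIGKILL) == -1)
        return -1;

    /* SIGKILL cannot be caught or ignored, so this wait will return. */
    do {
        r = waitpid(childpid, status, 0);
    } while (r == (pid_t)-1 && errno == EINTR);

    if (r != childpid)
        return -1;

    /* The child may have exited by itself just before the kill();
     * the wait status tells the two cases apart. */
    return (WIFSIGNALED(*status) && WTERMSIG(*status) == SIGKILL) ? 1 : 0;
}
```

A return value of 1 is the error condition discussed earlier in the thread: the file on the stick may be incomplete and should probably be copied again.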
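If the application is critical, you can install a dummy SIGALRM signal handler (empty body), and set a timeout (using alarm(seconds)) before the waitpid() call; the alarm signal will interrupt the waitpid() call, which will then return (pid_t)-1 with errno set to EINTR. Install the handler with sigaction() and leave SA_RESTART out of sa_flags; with glibc, signal() uses BSD semantics by default, which restart the interrupted waitpid() instead of letting it fail with EINTR. If the waitpid() call is successful, alarm(0) will defuse the timeout. This is extremely robust and reliable.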
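The timeout idea might look roughly like this. It is a sketch, not a drop-in: wait_with_timeout() is a hypothetical name, and it deviates from the empty-bodied handler by setting a flag, so that an EINTR caused by some unrelated signal is not mistaken for the timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

/* Nearly a dummy handler: it only records that the timeout expired. */
static void on_alarm(int signum)
{
    (void)signum;
    alarm_fired = 1;
}

/* Hypothetical helper: wait at most `seconds` (must be > 0) for the child.
 * Returns  1 if the child was reaped (*status is filled in),
 *          0 if the timeout expired first (child not reaped),
 *         -1 on any other error (errno is set). */
int wait_with_timeout(pid_t childpid, int *status, unsigned int seconds)
{
    struct sigaction act, old;
    pid_t r;
    int saved_errno;

    memset(&act, 0, sizeof act);
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;             /* no SA_RESTART: waitpid() must fail with EINTR */
    if (sigaction(SIGALRM, &act, &old) == -1)
        return -1;

    alarm_fired = 0;
    alarm(seconds);
    do {
        r = waitpid(childpid, status, 0);
    } while (r == (pid_t)-1 && errno == EINTR && !alarm_fired);
    saved_errno = errno;

    alarm(0);                     /* defuse the timeout if it is still pending */
    sigaction(SIGALRM, &old, NULL);
    errno = saved_errno;

    if (r == childpid)
        return 1;
    if (r == (pid_t)-1 && errno == EINTR)
        return 0;
    return -1;
}
```

On a 0 return the caller still owns the child and would typically follow up with kill(childpid,SIGKILL) and a blocking waitpid(). One caveat: if the alarm fires in the tiny window before waitpid() is entered, the wait is not interrupted, so very short timeouts are best avoided with this pattern.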
As to the other choices, you could always install a SIGCHLD handler, and use waitpid(childpid,&status,WNOHANG) in the signal handler -- waitpid() is async-signal-safe, thus okay to use in a signal handler -- to reap the child whenever it exits. It is a bit tricky to implement correctly, because the status must be accessed atomically (as other parts of the code might be midway reading the status just when the signal handler is triggered), and the child may exit at any time after the parent fork()s. In particular, you cannot assume that the main program can save the child PID (to be tried by the signal handler) before the SIGCHLD signal handler may be run; the signal handler must rely on the si_pid field in the siginfo_t structure, or you risk missing a signal. Considering the complexity, I would only use the signal approach in an asynchronous and/or multithreaded program.
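For completeness, here is one way the signal approach could be sketched, assuming a single-threaded parent with at most one child alive at a time (as in the original question). All names are hypothetical, and storing a PID in a sig_atomic_t assumes both are plain int, which holds on Linux but is not guaranteed elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Written by the handler, read by the main program. */
static volatile sig_atomic_t reaped_pid = 0;
static volatile sig_atomic_t reaped_status = 0;

static void on_sigchld(int signum, siginfo_t *info, void *context)
{
    int saved_errno = errno;      /* waitpid() may change errno */
    int status;

    (void)signum;
    (void)context;

    /* Use si_pid rather than a PID saved by the main program,
     * which might not have been stored yet when the signal arrives. */
    if (waitpid(info->si_pid, &status, WNOHANG) == info->si_pid) {
        reaped_status = status;
        reaped_pid = info->si_pid;
    }
    errno = saved_errno;
}

/* Install the handler; call this before the first fork(). */
int install_sigchld_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    act.sa_sigaction = on_sigchld;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_SIGINFO | SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &act, NULL);
}

/* Returns 1 and fills *pid and *status if a child was reaped since the
 * last call, 0 otherwise. SIGCHLD is blocked while the pair is read so
 * the handler cannot change it halfway through. */
int take_reaped(pid_t *pid, int *status)
{
    sigset_t block, old;
    int got = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    if (reaped_pid != 0) {
        *pid = (pid_t)reaped_pid;
        *status = (int)reaped_status;
        reaped_pid = 0;
        got = 1;
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return got;
}
```

With several children alive at once this would not be enough: standard signals are not queued, so two exits can arrive as one SIGCHLD, and the handler would then need to loop over waitpid(-1,&status,WNOHANG) until it returns 0 or -1. A threaded program would also use pthread_sigmask() in place of sigprocmask().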