Python's subprocess.run() method is held up by stdout
Hi all,
Here are a few short scripts that demonstrate the issue that I had:
Parent script:
Code:
#!/bin/bash
echo "PARENT: about to start the child script in the background..."
./childbash &
echo "PARENT: child script has been started in the background."
Child script:
Code:
#!/bin/bash
echo "CHILD: about to start a loop that runs forever..."
while true; do
    echo "CHILD: Looping..."
    sleep 2
done
echo "CHILD: done looping."
If you were to run the parent script in bash by simply invoking "./parentbash", it would start and exit almost immediately, returning you to the bash prompt. The childbash script would keep running in the background, forever printing to stdout until you kill it.
If you try to run the parentbash script from Python and capture stdout, however, the subprocess.run() call never returns.
Code:
import subprocess
import time

def main():
    print("PYTHON: running the parent script...")
    result = subprocess.run(['./parentbash'], stdout=subprocess.PIPE)
    print("PYTHON: application ending. Return code: {0}".format(result.returncode))

if __name__ == '__main__':
    main()
When you run this, the call to subprocess.run() will never return. If you remove the "stdout=" argument, it behaves as expected. The subprocess.run() call returns, the python application exits, and the childbash script keeps running in the background.
In my case, I didn't actually need stdout for anything, so I just removed it and moved on, but I was surprised that that was what caused the problem. In my googling, I found a similar issue here that talks about closing file descriptors. It seems like this is the issue, since both parentbash and childbash are using stdout, but there isn't much discussion at that link, so I'm not 100% sure I understand why we get this behavior.
Can anyone share a little more information on this? If I didn't know anything about how childbash worked, how would I ever know that it was using stdout and that that was the issue in my python application? What is a good way of "closing all file descriptors in your background scripts," particularly if it only uses stdout for occasional printing, as in this case? Is stdout always considered open, and always shared between a parent script and any applications it starts? Are there best practices for this?
If you have any thoughts, please let me know. Thanks!
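For anyone hitting the same thing: here is a minimal, self-contained sketch of the behavior, using bash -c one-liners as stand-ins for parentbash/childbash. With stdout=subprocess.PIPE, run() waits for EOF on the pipe, and the backgrounded "child" inherits the write end and holds it open; with subprocess.DEVNULL there is no pipe to drain, so run() returns as soon as the direct child exits.
Code:
```python
import subprocess
import time

# Stand-in for parentbash/childbash: the "parent" backgrounds a "child"
# that keeps the inherited stdout open for ~3 seconds.
parent = '(sleep 3; echo "CHILD: done") & echo "PARENT: exiting"'

# With a pipe, run() blocks until the *last* writer closes stdout,
# i.e. until the backgrounded child finishes (~3 s here).
t0 = time.monotonic()
subprocess.run(['bash', '-c', parent], stdout=subprocess.PIPE)
piped = time.monotonic() - t0

# With DEVNULL (or no stdout argument), run() returns as soon as the
# parent itself exits; the child keeps running in the background.
t0 = time.monotonic()
subprocess.run(['bash', '-c', parent], stdout=subprocess.DEVNULL)
devnull = time.monotonic() - t0

print("pipe: {0:.1f}s, devnull: {1:.1f}s".format(piped, devnull))
```
So "never returns" in the original report really means "returns only when every process holding the write end of the pipe has closed it" -- which, for an infinite child loop, is never.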
Code:
```python
import os
import sys

# Note: stdin, stdout, and stderr here are *file paths* supplied by the
# caller, not the streams themselves.
# Replace file descriptors for stdin, stdout, and stderr
with open(stdin, 'rb', 0) as f:
    os.dup2(f.fileno(), sys.stdin.fileno())
with open(stdout, 'ab', 0) as f:
    os.dup2(f.fileno(), sys.stdout.fileno())
with open(stderr, 'ab', 0) as f:
    os.dup2(f.fileno(), sys.stderr.fileno())
```
From the book
Quote:
Once the daemon process has been properly detached, it performs steps to reinitialize the standard I/O streams to point at files specified by the user. This part is actually somewhat tricky. References to file objects associated with the standard I/O streams are found in multiple places in the interpreter (sys.stdout, sys.__stdout__, etc.). Simply closing sys.stdout and reassigning it is not likely to work correctly, because there’s no way to know if it will fix all uses of sys.stdout. Instead, a separate file object is opened, and the os.dup2() call is used to have it replace the file descriptor currently being used by sys.stdout. When this happens, the original file for sys.stdout will be closed and the new one takes its place. It must be emphasized that any file encoding or text handling already applied to the standard I/O streams will remain in place.
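To make the quote concrete, here is a minimal runnable sketch of the dup2() trick (my own illustration, not from the book): we swap the file descriptor *behind* sys.stdout instead of rebinding sys.stdout, so every existing reference to the stream picks up the new target. A temp file stands in for the daemon's log.
Code:
```python
import os
import sys
import tempfile

# Stand-in for the daemon's log file.
fd, log_path = tempfile.mkstemp()
os.close(fd)

sys.stdout.flush()
saved = os.dup(sys.stdout.fileno())           # keep a copy so we can restore
with open(log_path, 'ab', 0) as f:
    os.dup2(f.fileno(), sys.stdout.fileno())  # fd 1 now points at the file

print("this goes to the log, not the terminal")
sys.stdout.flush()

os.dup2(saved, sys.stdout.fileno())           # put the terminal back
os.close(saved)

with open(log_path) as f:
    logged = f.read()
```
After the first dup2(), anything written through sys.stdout (or by any child process that inherits fd 1) lands in the log file.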
@OP: Are you just experimenting, or do you actually want to create daemon (aka batch) processes? In the latter case, be sure to redirect both stdout and stderr into files, ignore SIGHUP (trap '' HUP in shell), and chdir into some known location (such as /).
subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, cwd=None, timeout=None, check=False, encoding=None, errors=None)
Run the command described by args. Wait for command to complete, then return a CompletedProcess instance.
....
If you want to "release" the process but still capture stdout, you need to use Popen.
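For example, something along these lines (a sketch with an inline bash -c stand-in for the OP's script): Popen() returns as soon as the process is started, so you choose when, or whether, to read stdout and when to wait.
Code:
```python
import subprocess

proc = subprocess.Popen(['bash', '-c', 'echo started; sleep 1; echo done'],
                        stdout=subprocess.PIPE, text=True)

first_line = proc.stdout.readline().strip()  # read what is available now
# ... the script is still running in the background at this point ...
proc.wait()                                  # block only when you choose to
rest = proc.stdout.read()                    # drain whatever is left
```
One caveat: if you never read a PIPE'd stdout, the child can eventually block once the OS pipe buffer fills up, so either keep draining it or don't capture at all.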
@Sefyir: That's a good reference, thank you! Looks like doing this would require some work to redirect all the standard streams.
@NevemTeve: I'm experimenting. In this case, I don't need stdout, I just wasn't thinking when I put it in the subprocess.run() arguments. I was surprised by the result and wanted to hear some discussion to make sure I understood the behavior. When you say
Quote:
be sure to redirect both stdout and stderr into files
do you mean that redirection should happen in the python script (as Sefyir demonstrated), the parent bash script, the child bash script, or some combination? Thanks!
@pan64: Wouldn't be a forum post without someone telling me to read the docs. I have read through the subprocess module documentation, but this appears to be an edge case in which the docs are ambiguous. I am fine with waiting for the parent script to complete. However, since the parent script starts a child script in the background, my expectation is that the parent is complete as soon as the child starts running. If subprocess.run() is called with a stdout argument, the parent is not considered complete until the child ends. If stdout is NOT specified in subprocess.run(), then "wait for command to complete" describes a different behavior. My goal is to understand what "complete" means when standard streams (or other file descriptors) are involved.
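This distinction can actually be observed with Popen (a sketch, using a bash -c stand-in): the direct child can already be dead while communicate() still blocks, because run() with stdout=PIPE effectively waits for *both* process exit *and* EOF on the pipe, and the backgrounded grandchild holds the inherited write end open.
Code:
```python
import subprocess
import time

script = '(sleep 2; echo GRANDCHILD) & echo PARENT done'
proc = subprocess.Popen(['bash', '-c', script],
                        stdout=subprocess.PIPE, text=True)

time.sleep(0.5)
exited_early = proc.poll() is not None  # True: bash itself has already exited

out, _ = proc.communicate()             # still blocks ~2 s, until pipe EOF
```
So "complete" for run() without stdout= means "the direct child exited"; with stdout=PIPE it means "the direct child exited AND the pipe hit EOF", and any process that inherited fd 1 postpones that EOF.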
Thanks, all. If anyone else has other thoughts, please let me know.