Python: How to stream from the stdout of a subprocess

Question

My goal is to write a tiny Python script that makes Jupyter easier to work with.

So I wrote this script:

import signal
import socket
import subprocess
import sys

sp = None
port = 8888


def get_own_ip():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(('1.1.1.1', 1))
        IP = s.getsockname()[0]
    except:
        IP = '127.0.0.1'
    finally:
        s.close()
    return IP


def signal_handler(sig, frame):
    # terminates Jupyter by sending two SIGINTs to it
    if sp is not None:
        # send termination to jupyter
        sp.send_signal(signal.SIGINT)
        sp.send_signal(signal.SIGINT)

        sys.exit(0)


if __name__ == "__main__":

    own_ip = get_own_ip()

    sp = subprocess.Popen(["jupyter-notebook"
                           , "--ip='%s'" % own_ip
                           , "--port=%i" % port
                           , "--no-browser"],
                          stdout=subprocess.PIPE,
                          stdin=subprocess.PIPE,
                          bufsize=1)

    print(sp)

    signal.signal(signal.SIGINT, signal_handler)

    with sp.stdout:
        print('read')
        for line in sp.stdout.readline():
            print('line: %s' % line)
    print('wait')
    sp.wait()  # wait for the subprocess to exit

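First, I look up my IP address so I can pass it to Jupyter as an argument. Then I start Jupyter, and while it is running I want to filter some of its output (stdout). But sp.stdout.readline() seems to block.

The code above prints the following output to the terminal: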

/usr/bin/python3.6 /home/alex/.scripts/get_own_ip.py
<subprocess.Popen object at 0x7fa956374240>
read
[I 22:43:31.611 NotebookApp] Serving notebooks from local directory: /home/alex/.scripts
[I 22:43:31.611 NotebookApp] The Jupyter Notebook is running at:
[I 22:43:31.611 NotebookApp] http://192.168.18.32:8888/?token=c4b7784d784206fc357b8f484b8d659fed6a2b1733b46ae6
[I 22:43:31.611 NotebookApp]  or http://127.0.0.1:8888/?token=c4b7784d784206fc357b8f484b8d659fed6a2b1733b46ae6
[I 22:43:31.611 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 22:43:31.614 NotebookApp] 

    To access the notebook, open this file in a browser:
        file:///home/alex/.local/share/jupyter/runtime/nbserver-18280-open.html
    Or copy and paste one of these URLs:
        http://192.168.18.32:8888/?token=c4b7784d784206fc357b8f484b8d659fed6a2b1733b46ae6
     or http://127.0.0.1:8888/?token=c4b7784d784206fc357b8f484b8d659fed6a2b1733b46ae6

As you can see, the output appears, but sp.stdout.readline() never picks it up.

How do I stream from sp.stdout?

Following the hint from @Douglas Myers-Turnbull, I changed the main function to:

if __name__ == "__main__":

    own_ip = get_own_ip()
    # store ip as byte stream
    own_ip_bstr = own_ip.encode()

    sp = subprocess.Popen(["jupyter-notebook"
                           , "--ip='%s'" % own_ip
                           , "--port=%i" % port
                           , "--no-browser"],
                          stderr=subprocess.PIPE,
                          stdin=subprocess.PIPE,
                          bufsize=1)

    # set up handler to terminate jupyter
    signal.signal(signal.SIGINT, signal_handler)

    with open('jupyter.log', mode='wb') as flog:
        for line in sp.stderr:
            flog.write(line)
            if own_ip_bstr in line.strip():
                with open('jupyter.url', mode='w') as furl:
                    furl.write(line.decode().split('NotebookApp] ')[1])
                break

        for line in sp.stderr:
            flog.write(line)
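
The URL extraction in the loop above can be pulled into a small helper and checked against the log lines shown earlier. This is only a sketch: `extract_url` is a hypothetical name, and the regular expression assumes the log format seen in the output above (`[I <time> NotebookApp] http://...`). Unlike the `split` call, it also drops the trailing newline.

```python
import re
from typing import Optional

# Assumed log format, taken from the output shown in the question:
#   [I 22:43:31.611 NotebookApp] http://<ip>:<port>/?token=<hex>
_URL_RE = re.compile(rb"NotebookApp\]\s+(https?://\S+)")


def extract_url(line: bytes, ip: bytes) -> Optional[str]:
    """Return the URL from a log line that mentions ip, or None."""
    if ip not in line:
        return None
    match = _URL_RE.search(line)
    return match.group(1).decode() if match else None


if __name__ == "__main__":
    sample = b"[I 22:43:31.611 NotebookApp] http://192.168.18.32:8888/?token=abc\n"
    print(extract_url(sample, b"192.168.18.32"))
```

Lines that do not contain the IP, or that contain it without the `NotebookApp]` prefix, return `None`, so the caller can keep reading.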

Answer 1

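You need to capture stderr!

I think these messages are written to stderr rather than stdout, so you need to read from sp.stderr instead. This is common for programs that use the Python logging framework.

You can check whether this is the case by running it in a shell (if you are on Linux):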

jupyter notebook > stdout.log 2> stderr.log
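
The same check can be sketched in Python. Because `jupyter notebook` keeps running until it is interrupted, this sketch uses a short-lived stand-in child (an assumption for illustration) that emits one message through `logging`, which writes to stderr by default. `capture_streams` is a hypothetical helper name.

```python
import subprocess
import sys


def capture_streams(cmd):
    """Run cmd to completion and return (stdout, stderr) as text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in child process: logs one warning via the logging module.
    child = [sys.executable, "-c", "import logging; logging.warning('hello')"]
    out, err = capture_streams(child)
    print("stdout:", repr(out))
    print("stderr:", repr(err))
```

If the message shows up under stderr and stdout stays empty, reading from `sp.stdout` will never see it.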

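If the buffer starts filling up...

With the output of jupyter notebook alone you probably won't run into this, but I have previously hit a bug where the output buffer filled up before my calling code could process it (once the pipe buffer is full, the child blocks on its next write). You need to make sure your code handles lines from stdout (and/or stderr) at least as fast as jupyter notebook writes them. If it can't, you can handle the lines by pushing them onto a queue. Something like this: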

import subprocess
from queue import Queue
from threading import Thread


def _reader(pipe_type, pipe, queue):
    """Read lines (as bytes) from the pipe and put them on the queue."""
    try:
        with pipe:
            for line in iter(pipe.readline, b""):
                queue.put((pipe_type, line))
    finally:
        # Sentinel: tells the consumer that this pipe is exhausted.
        queue.put(None)


def stream_cmd(cmd, log_callback, cwd=None, timeout_secs=None):
    """Stream lines of stdout and stderr into a queue, then call log_callback on them.

    The reader threads drain the pipes as fast as the child writes, so it's ok
    if log_callback takes a bit longer than the output.
    """
    # bufsize=1 (line buffering) only applies in text mode; these pipes are binary.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd)
    try:
        q = Queue()
        Thread(target=_reader, args=[1, p.stdout, q]).start()
        Thread(target=_reader, args=[2, p.stderr, q]).start()
        # Each reader puts one None sentinel, so drain until both have been seen.
        for _ in range(2):
            for source, line in iter(q.get, None):
                log_callback(source, line)
        exit_code = p.wait(timeout=timeout_secs)
    finally:
        p.kill()
    if exit_code != 0:
        raise subprocess.CalledProcessError(
            exit_code, " ".join(cmd), "<<unknown>>", "<<unknown>>"
        )

I have used similar code successfully before, but there may be bugs in this code.
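
Separately from the stderr issue, the loop in the question's first script, `for line in sp.stdout.readline()`, reads a single line and then iterates over the bytes of that line, not over lines. Iterating over the pipe itself (or over `iter(pipe.readline, b"")`) yields one line per step. A minimal sketch of the difference, using a stand-in child process (an assumption for illustration) and hypothetical helper names:

```python
import subprocess
import sys

# Stand-in child process: prints two lines to stdout.
CHILD = [sys.executable, "-c", "print('first'); print('second')"]


def read_lines(cmd):
    """Iterate over the pipe: each item is one line (bytes)."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        return [line for line in proc.stdout]


def read_one_line_bytewise(cmd):
    """Reproduce the question's loop: iterate over the result of readline()."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        items = [item for item in proc.stdout.readline()]
        proc.stdout.read()  # drain the rest so the child can exit cleanly
    return items


if __name__ == "__main__":
    print(read_lines(CHILD))              # a list of bytes objects, one per line
    print(read_one_line_bytewise(CHILD))  # a list of ints: the bytes of line one
```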

Answer 1: 3, accepted, 2020-05-05 23:52:53
