I just want to understand what happens in the "background" in terms of memory usage when dealing with a subprocess.Popen() result and reading line by line. Here's a simple example.
Given the following script test.py, which prints "Hello", waits 10 s, and then prints "World":
import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")
Then the following script test_sub.py calls test.py as a subprocess, redirects its stdout to a pipe, and reads it line by line:
import subprocess, time, os, sys

cmd = ["python3", "test.py"]
# Merge stderr into stdout and decode the pipe as text
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     universal_newlines=True)
# readline() returns '' at EOF, which ends the loop
for line in iter(p.stdout.readline, ''):
    print("---" + line.rstrip())
My question is this: when I run test_sub.py, it prints "Hello", waits 10 s until "World" arrives, and then prints it. What happens to "Hello" during those 10 s of waiting? Does it get stored in memory until test_sub.py finishes, or does it get tossed away in the first iteration?
This may not matter too much for this example, but when dealing with really big files it does.
what happens to "Hello" during those 10s of waiting?
The "Hello" string (in the parent) is available via the name line until .readline() returns the second time, i.e., "Hello" lives at the very least until the output of print("World") is read in the parent.
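To make the "really big files" case concrete, here is a minimal sketch of the same reading loop that keeps only running totals, so at most one line is bound to the name line at any time. The child command, the CHILD_CODE string, and the count_lines helper are hypothetical stand-ins, not part of the original scripts.

```python
import subprocess
import sys

# Hypothetical child: prints many lines, standing in for a large output.
CHILD_CODE = "for i in range(1000): print('line', i)"

def count_lines(cmd):
    """Read the child's output line by line, keeping only running totals."""
    count = 0
    longest = 0
    with subprocess.Popen(cmd,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          universal_newlines=True) as p:
        # Each iteration rebinds `line`; earlier lines are not accumulated.
        for line in iter(p.stdout.readline, ''):
            count += 1
            longest = max(longest, len(line.rstrip()))
    return count, longest

result = count_lines([sys.executable, "-c", CHILD_CODE])
print(result)
```

Because nothing in the loop stores line in a list or other container, the parent's own memory use should not grow with the number of lines, apart from whatever the interpreter and the pipe buffer hold internally.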
If you mean what happens in the child process, then after sys.stdout.flush() there is no reason for the "Hello" object to continue to live, but it may, e.g., see Does Python intern strings?
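One reason the literal can outlive the print call in the child, at least on CPython, is that string literals are stored among the constants of the compiled code object. The sketch below wraps the print in a hypothetical function named child so that the code object is easy to inspect; it is an illustration, not a description of what test.py does internally.

```python
def child():
    # Stands in for the print("Hello") call in test.py
    print("Hello")

# On CPython, the literal is kept in the code object's constants,
# so it stays alive for as long as the code object does.
consts = child.__code__.co_consts
print("Hello" in consts)
```

This is an implementation detail and may differ on other Python implementations.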
Does it get stored in memory until test_sub.py finishes, or does it get tossed away in the first iteration?
After .readline() returns the second time, line refers to "World". What happens to "Hello" after that depends on garbage collection in the specific Python implementation, i.e., even if line is "World", the object "Hello" may continue to live for some time. See also: Releasing memory in Python.
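The effect of rebinding can be observed with a weak reference. This is a minimal sketch that assumes CPython's reference counting; the Line class and the lines generator are hypothetical, introduced only because plain str objects cannot be weakly referenced.

```python
import weakref

class Line(str):
    """Hypothetical str subclass; plain str cannot be weakly referenced."""

def lines():
    # Stands in for successive p.stdout.readline() results
    yield Line("Hello")
    yield Line("World")

it = lines()
line = next(it)
watcher = weakref.ref(line)            # observe "Hello" without keeping it alive
alive_before = watcher() is not None   # `line` still refers to "Hello"
line = next(it)                        # rebinding drops the last strong reference
alive_after = watcher() is not None
print(alive_before, alive_after)
```

On CPython the first object is expected to be freed as soon as line is rebound, because its reference count drops to zero. Other implementations may delay reclamation until a later garbage collection pass, which is the implementation dependence described above.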
You could set the PYTHONDUMPREFS=1 environment variable and run your code using a debug python build to see the objects that are alive when the python process exits. For example, consider this code:
#!/usr/bin/env python3
import threading
import time
import sys

def strings():
    yield "hello"
    time.sleep(.5)
    yield "world"
    time.sleep(.5)

def print_line():
    # Report the current binding of the global name `line` every 0.1 s
    while True:
        time.sleep(.1)
        print('+++', line, file=sys.stderr)

threading.Thread(target=print_line, daemon=True).start()
for line in strings():
    print('---', line)
time.sleep(1)
It demonstrates that line is not rebound until the second yield. The output of PYTHONDUMPREFS=1 ./python . |& grep "'hello'" shows that 'hello' is still alive when python exits.