When using Python's `ThreadPool` to parallelize a CPU-intensive task, it seems like the memory used by the workers accumulates and is not released. I've tried to reduce the problem to a minimal example:
```python
import numpy as np
from multiprocessing.pool import ThreadPool

def worker(x):
    # Bloat the memory footprint of this function
    a = x ** x
    b = a + x
    c = x / b
    return hash(c.tobytes())

tasks = (np.random.rand(1000, 1000) for _ in range(500))
with ThreadPool(4) as pool:
    for result in pool.imap(worker, tasks):
        assert result is not None
```
When running this snippet, one can easily observe a huge jump in Python's memory footprint. However, I would have expected it to behave nearly the same as the sequential version
```python
for task in tasks:
    assert worker(task) is not None
```
whose memory cost is negligible.
How do I have to modify the snippet to apply the `worker` function to each array using a `ThreadPool` without this memory blow-up?
Turns out the explanation is quite simple. Modifying the example so that the random array is created only inside the worker solves the problem:
```python
def worker(x):
    x = x()  # call the factory to create the array here, inside the worker
    # Bloat the memory footprint of this function
    a = x ** x
    b = a + x
    c = x / b
    return hash(c.tobytes())

tasks = (lambda: np.random.rand(1000, 1000) for _ in range(500))
```
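For reference, here is a self-contained sketch of the same factory approach; it substitutes a plain `bytes` buffer for the NumPy arrays so the example runs with the standard library only (the buffer size and task count are arbitrary placeholders):

```python
from multiprocessing.pool import ThreadPool

def worker(make_x):
    # The large object is created here, inside the worker,
    # instead of being pre-built by the generator
    x = make_x()
    return hash(x)

# A generator of factories; bytes(100_000) is a stdlib stand-in
# for np.random.rand(1000, 1000) from the question
tasks = (lambda: bytes(100_000) for _ in range(50))

with ThreadPool(4) as pool:
    results = list(pool.imap(worker, tasks))

print(len(results))  # 50
```

At any moment, only the arrays currently being processed by the pool's workers exist in memory.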
It seems like `ThreadPool.imap` internally consumes the generator `tasks` much faster than the workers process it: a dedicated task-handler thread pulls items from the iterable and queues them for the workers. This of course requires storing all 500 random arrays in memory at once.
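This eager consumption is easy to observe with a counting generator and a deliberately slow worker (the exact count is timing-dependent, so treat the comment as an illustration rather than a guarantee):

```python
import time
from multiprocessing.pool import ThreadPool

consumed = 0

def counting_tasks(n):
    # Count how many items imap has pulled from the generator
    global consumed
    for i in range(n):
        consumed += 1
        yield i

def slow_worker(x):
    time.sleep(0.01)
    return x

with ThreadPool(2) as pool:
    results = pool.imap(slow_worker, counting_tasks(100))
    first = next(results)
    # Only one result is ready, yet the pool's task-handler thread
    # has typically pulled far more items from the generator already
    print(first, consumed)
```

In the memory problem above, each pulled item was a freshly built 1000×1000 array, which is why they piled up; with factories, each pulled item is just a cheap callable.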