
Python multiprocessing apply_async "assert left > 0" AssertionError

I am trying to load numpy files asynchronously in a Pool:

self.pool = Pool(2, maxtasksperchild = 1)
...
nextPackage = self.pool.apply_async(loadPackages, (...))
for fi in np.arange(len(files)):
    packages = nextPackage.get(timeout=30)
    # preload the next package asynchronously. It will be available
    # by the time it is required.
    nextPackage = self.pool.apply_async(loadPackages, (...))

The method "loadPackages":

def loadPackages(... (2 strings & 2 ints) ...):
    print("This isn't printed!')
    packages = {
        "TRUE": np.load(gzip.GzipFile(path1, "r")),
        "FALSE": np.load(gzip.GzipFile(path2, "r"))
    }
    return packages

Before even the first "package" is loaded, the following error occurs:

Exception in thread Thread-8:
Traceback (most recent call last):
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\pool.py", line 463, in _handle_results
    task = get()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 318, in _recv_bytes
    return self._get_more_data(ov, maxsize)
  File "C:\Users\roman\Anaconda3\envs\tsc1\lib\multiprocessing\connection.py", line 337, in _get_more_data
    assert left > 0
AssertionError

I monitor the resources closely: memory is not an issue, and I still have plenty left when the error occurs. The unzipped files are just plain multidimensional numpy arrays. Individually, using a Pool with a simpler method works, and loading the files like that works; only the combination fails. (All of this happens in a custom keras generator. I doubt this helps, but who knows.) Python 3.5.

What could the cause of this issue be? How can this error be interpreted?

Thank you for your help!

There is a bug in the Python C core code that prevents data responses bigger than 2 GB from returning correctly to the main thread. You need to either split the data into smaller chunks, as suggested in the previous answer, or not use multiprocessing for this function.
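As a minimal sketch of the "don't send the big object back" option, under the assumption that the arrays may be staged on disk: the worker saves each loaded array to a temporary .npy file and returns only the paths, so nothing close to 2 GB ever travels through the result pipe. The helper load_package_to_disk and the input file names are hypothetical, not taken from the question.

import gzip
import os
import tempfile
import numpy as np
from multiprocessing import Pool

def load_package_to_disk(path1, path2):
    # Load the compressed arrays in the worker, but write them to temporary
    # .npy files instead of returning them; only short path strings cross
    # the result pipe, staying far below the 2 GB limit.
    out = {}
    for key, src in (("TRUE", path1), ("FALSE", path2)):
        arr = np.load(gzip.GzipFile(src, "r"))
        fd, tmp = tempfile.mkstemp(suffix=".npy")
        os.close(fd)
        np.save(tmp, arr)
        out[key] = tmp
    return out

if __name__ == "__main__":
    pool = Pool(2, maxtasksperchild=1)
    result = pool.apply_async(load_package_to_disk, ("true.npy.gz", "false.npy.gz"))
    paths = result.get(timeout=30)
    # The parent process loads (or memory-maps) the arrays itself.
    packages = {key: np.load(path) for key, path in paths.items()}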

I reported this bug to the Python bug tracker ( https://bugs.python.org/issue34563 ) and created a PR ( https://github.com/python/cpython/pull/9027 ) to fix it, but it will probably take a while to get released (UPDATE: the fix is present in Python 3.8.0+).

If you are interested, you can find more details on what causes the bug in the bug description at the link I posted.
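A quick way to check whether a particular result is likely to hit this limit is to measure the size of its pickle before returning it from the worker, since the pool sends results back as pickled bytes. This is only a rough diagnostic sketch; the packages dictionary below is a hypothetical stand-in for the worker's real return value.

import pickle
import numpy as np

# Hypothetical stand-in for the dictionary a worker would return.
packages = {
    "TRUE": np.zeros((2048, 2048), dtype=np.float64),
    "FALSE": np.zeros((2048, 2048), dtype=np.float64),
}

size = len(pickle.dumps(packages, protocol=pickle.HIGHEST_PROTOCOL))
print(f"pickled result: {size / 2**30:.3f} GiB")
# If this approaches 2 GiB for the real data, the AssertionError above
# is the likely failure mode on Python versions before 3.8.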

I think I've found a workaround by retrieving data in small chunks. In my case it was a list of lists.

I had:

for i in range(0, NUMBER_OF_THREADS):
    print('MAIN: Getting data from process ' + str(i) + ' proxy...')
    X_train.extend(ListasX[i]._getvalue())
    Y_train.extend(ListasY[i]._getvalue())
    ListasX[i] = None
    ListasY[i] = None
    gc.collect()

Changed to:

CHUNK_SIZE = 1024
for i in range(0, NUMBER_OF_THREADS):
    print('MAIN: Getting data from process ' + str(i) + ' proxy...')
    for k in range(0, len(ListasX[i]), CHUNK_SIZE):
        X_train.extend(ListasX[i][k:k+CHUNK_SIZE])
        Y_train.extend(ListasY[i][k:k+CHUNK_SIZE])
    ListasX[i] = None
    ListasY[i] = None
    gc.collect()

And now it seems to work, possibly because less data is serialized at a time. So maybe if you can segment your data into smaller portions, you can overcome the issue. Good luck!
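The same idea can be wrapped in a small helper that copies any manager list proxy into a plain local list slice by slice, so each access to the proxy only serializes a small piece. A minimal sketch, assuming the shared data is a Manager().list(); the helper name get_in_chunks is hypothetical.

from multiprocessing import Manager

def get_in_chunks(proxy_list, chunk_size=1024):
    # Copy a manager list proxy into a plain local list chunk by chunk,
    # so each proxy access moves only a small slice over the connection.
    out = []
    for start in range(0, len(proxy_list), chunk_size):
        out.extend(proxy_list[start:start + chunk_size])
    return out

if __name__ == "__main__":
    manager = Manager()
    shared = manager.list(range(10000))   # stand-in for ListasX[i]
    local = get_in_chunks(shared)
    assert local == list(range(10000))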
