Python multiprocessing pool.map raises IndexError
I've developed a utility in Python/Cython that sorts CSV files and generates stats for a client, but invoking pool.map seems to raise an exception before my mapped function has a chance to execute. Sorting a small number of files works as expected, but once the number of files grows to around 10, I get the IndexError below after calling pool.map. Does anyone recognize this error? Any help is greatly appreciated.
While the code is under NDA, the use case is fairly simple:
Code Sample:
import multiprocessing

def sort_files(csv_files):
    pool_size = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=pool_size)
    # chunksize=1: each worker task receives exactly one file
    sorted_dicts = pool.map(sort_file, csv_files, 1)
    return sorted_dicts

def sort_file(csv_file):
    print 'sorting %s...' % csv_file
    # sort code
Output:
  File "generic.pyx", line 17, in generic.sort_files (/users/cyounker/.pyxbld/temp.linux-x86_64-2.7/pyrex/generic.c:1723)
    sorted_dicts = pool.map(sort_file, csv_files, 1)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
IndexError: list index out of range
The IndexError is raised somewhere inside sort_file(), i.e. in a worker subprocess, and is then re-raised by the parent process. Apparently multiprocessing makes no attempt to tell us where the error really comes from (e.g. the line on which it occurred), or even which argument to sort_file() caused it. I hate multiprocessing even more :-(
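One common workaround for the missing location information is to catch the exception inside the worker and re-raise it with the formatted traceback and the offending argument embedded in the message. The following is only a minimal sketch under stated assumptions: WorkerError and traced_sort_file are hypothetical names introduced here, and the body of sort_file is a stand-in for the real (NDA'd) sort code.

```python
import multiprocessing
import traceback


class WorkerError(Exception):
    """Hypothetical exception type whose message carries the worker traceback."""


def sort_file(csv_file):
    # Stand-in for the real sort code (assumption): treats its argument as a
    # comma-separated line and fails with IndexError when it has < 3 fields.
    fields = csv_file.split(',')
    return fields[2]


def traced_sort_file(csv_file):
    """Run sort_file() and, on failure, report the argument and full traceback."""
    try:
        return sort_file(csv_file)
    except Exception:
        # format_exc() has to be called here, in the worker, while the original
        # traceback still exists; only the resulting string travels back.
        raise WorkerError('sort_file(%r) failed:\n%s'
                          % (csv_file, traceback.format_exc()))


def sort_files(csv_files):
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    try:
        # Map the wrapper instead of sort_file itself.
        return pool.map(traced_sort_file, csv_files, 1)
    finally:
        pool.close()
        pool.join()
```

Calling sort_files(...) from under an `if __name__ == '__main__':` guard should then surface a WorkerError in the parent whose message names the failing argument and the line inside sort_file. The sketch avoids print statements and version-specific syntax, so it should behave the same on Python 2.7 and 3.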
Check further up in the command output. In Python 3.4 at least, multiprocessing.pool will helpfully print a RemoteTraceback above the parent process traceback. You'll see something like:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/path/to/your/code/here.py", line 80, in sort_file
    something = row[index]
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "generic.pyx", line 17, in generic.sort_files (/users/cyounker/.pyxbld/temp.linux-x86_64-2.7/pyrex/generic.c:1723)
    sorted_dicts = pool.map(sort_file, csv_files, 1)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
IndexError: list index out of range
In the case above, the code raising the error is at "/path/to/your/code/here.py", line 80.
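The remote traceback can also be read programmatically. In CPython 3.4 and later, the exception re-raised by pool.map has the RemoteTraceback attached as its `__cause__`, and converting that to a string yields the worker-side traceback text. This is an implementation detail rather than a documented API, so treat the following as a hedged sketch; remote_traceback_text is a hypothetical helper and sort_file is a stand-in for the real sort code.

```python
import multiprocessing


def remote_traceback_text(exc):
    """Return the worker-side traceback attached to exc, or None if absent."""
    # On Python 2 there is no __cause__ attribute, so this returns None.
    cause = getattr(exc, '__cause__', None)
    return str(cause) if cause is not None else None


def sort_file(csv_file):
    # Stand-in for the real sort code (assumption): treats its argument as a
    # comma-separated line and fails with IndexError when it has < 3 fields.
    fields = csv_file.split(',')
    return fields[2]


def sort_files(csv_files):
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    try:
        return pool.map(sort_file, csv_files, 1)
    except Exception as exc:
        text = remote_traceback_text(exc)
        if text is not None:
            # Log the worker traceback before letting the error propagate.
            print('worker traceback:\n' + text)
        raise
    finally:
        pool.close()
        pool.join()
```

This can be handy when the traceback needs to go to a log file instead of stderr. On Python 2.7, as in the question, no cause is attached, so the helper returns None and the wrapper approach is the fallback.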
See also: debugging errors in python multiprocessing