Pool.imap_unordered skips value from the iterable
I am trying to run the following code to parallelize a function that crops GeoTIFFs. The GeoTIFFs are named <location>__img_news1a_iw_rt30_<hex_code>_g_gpf_vv.tif. The code works fine, except that it consistently skips one particular GeoTIFF and never even reads it from the vv_tif iterable. Specifically, out of locationA_img_news1a_iw_rt30_20170314t115609_g_gpf_vv.tif, locationA_img_news1a_iw_rt30_20170606t115613_g_gpf_vv.tif and locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif, it skips locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif every single time when I read these files along with GeoTIFFs from other locations. However, it does read this file if I create an iterable from only these three GeoTIFF files. I have tried changing chunksize, but it doesn't help. Am I missing something here?
from multiprocessing import Pool, cpu_count

try:
    pool = Pool(cpu_count())
    pool.imap_unordered(tile_geotif, vv_tif, chunksize=11)
finally:
    pool.close()
EDIT: I have 55 files in total, and it drops only the locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif file, every single time.
This is too much to show in a comment, so I am posting it as an answer.
The map functions seem to work in my toy examples below. I think there is an error in your input data that causes the corrupted output. Either that, or you have found a bug. If so, please try to create a reproducible example.
from multiprocessing import Pool

def square(x):
    # Despite its name, this computes x**x; the output below matches that.
    return x**x

if __name__ == "__main__":
    vv_tif = list(range(10))
    with Pool(5) as p:
        print(p.map(square, vv_tif))
    with Pool(5) as p:
        print(list(p.imap(square, vv_tif)))
    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif)))
    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif, chunksize=11)))
Output:
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 256, 3125, 46656, 823543, 16777216, 4, 27, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
Usually all 4 lines were the same. I ran it a few times until one of them came back in a different order. It looks to me like it works.
Note that this demonstrates that the various map functions are not mutating the underlying data.
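To tell a genuinely skipped input from one that was processed but failed quietly, one option is to have the worker return its own input and compare what comes back against the original list. The sketch below is only an illustration under assumptions: `tile_geotif_checked` is a hypothetical stand-in for the real `tile_geotif`, `find_missing` is a helper invented here, and the file names are made up.

```python
from multiprocessing import Pool


def tile_geotif_checked(path):
    # Hypothetical wrapper: do the real cropping here, then return the
    # input so the parent process can see which items were handled.
    return path


def find_missing(inputs, processed):
    # Inputs that never came back from the pool, in sorted order.
    return sorted(set(inputs) - set(processed))


def main():
    # Made-up file names, for illustration only.
    vv_tif = [f"file_{i}.tif" for i in range(55)]
    with Pool(4) as pool:
        # list() consumes the iterator, so every result is collected
        # before the pool is torn down.
        processed = list(
            pool.imap_unordered(tile_geotif_checked, vv_tif, chunksize=11)
        )
    print("missing:", find_missing(vv_tif, processed))


if __name__ == "__main__":
    main()
```

Iterating over the results also re-raises any exception raised inside a worker, which would otherwise go unnoticed in the parent process.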
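Please notice how the results differ depending on whether the "time.sleep" line is commented out or not. imap_unordered returns immediately and pool.close() does not wait for outstanding tasks, so without the sleep the main process can exit before the workers have processed every item.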
import time
from multiprocessing import Pool

def process(x):
    print(x)

def main():
    pool = Pool(4)
    pool.imap_unordered(process, (1,2,3,4,5))
    pool.close()
    #time.sleep(3)

if __name__ == "__main__":
    main()
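Rather than relying on `time.sleep`, the parent can wait for the pool explicitly. Here is a minimal sketch of two ways to do that using only standard `multiprocessing.Pool` methods. The function names `run_by_consuming` and `run_by_joining` are made up for this example, and `process` is changed to return its argument so there is something to collect.

```python
from multiprocessing import Pool


def process(x):
    print(x)
    return x


def run_by_consuming(items):
    with Pool(4) as pool:
        # Iterating over the result blocks until every item is done.
        return sorted(pool.imap_unordered(process, items))


def run_by_joining(items):
    pool = Pool(4)
    pool.imap_unordered(process, items)
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the workers to finish the queued tasks


if __name__ == "__main__":
    print(run_by_consuming((1, 2, 3, 4, 5)))
    run_by_joining((1, 2, 3, 4, 5))
```

Consuming the iterator is probably the better fit when the results or any worker exceptions matter. The close-then-join form only guarantees that the queued tasks have run before the program continues.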