
Pool.imap_unordered skips value from the iterable

I am trying to run the following code to parallelize a function that crops GeoTIFFs. The GeoTIFFs are named <location>__img_news1a_iw_rt30_<hex_code>_g_gpf_vv.tif. The code works fine, except that one particular GeoTIFF is never even read from the vv_tif iterable. Specifically, out of locationA_img_news1a_iw_rt30_20170314t115609_g_gpf_vv.tif, locationA_img_news1a_iw_rt30_20170606t115613_g_gpf_vv.tif and locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif, it skips locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif every single time when I process these files together with the GeoTIFFs for other locations. However, the file is read if I create an iterable from only these three GeoTIFFs. I have tried changing chunksize, but it doesn't help. Am I missing something here?

from multiprocessing import Pool, cpu_count
try:
    pool = Pool(cpu_count())
    pool.imap_unordered(tile_geotif, vv_tif, chunksize=11)
finally:
    pool.close()

EDIT: I have 55 files in total, and only locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif is dropped, every single time.

This is too much to show in a comment, so I am putting it here as an answer.

It seems to me that the map functions work in my toy examples below. I think there is an error in your input data that causes the corrupted output. Either that, or you have found a bug. If so, do try to create a reproducible example.

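from multiprocessing import Pool

vv_tif = list(range(10))

def square(x):
    # Despite the name, this computes x**x (0**0 == 1 in Python),
    # which is what the output below reflects.
    return x**x

if __name__ == "__main__":
    # The guard is needed on platforms that spawn worker processes.
    with Pool(5) as p:
        print(p.map(square, vv_tif))

    with Pool(5) as p:
        print(list(p.imap(square, vv_tif)))

    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif)))

    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif, chunksize=11)))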

Output:

[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 256, 3125, 46656, 823543, 16777216, 4, 27, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]

Usually all four lines were the same. I ran it a few times until I got a different ordering on one of them. It looks to me like it works.

Note that this demonstrates that the various map functions are not mutating the underlying data.
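One way to build a reproducible example is to have each worker report which input it handled and then compare that against the full input list. The sketch below assumes the worker returns the item it processed (or something equal to it); find_missing is a hypothetical helper name, not part of multiprocessing, and echo_name is only a stand-in for the real worker.

```python
from multiprocessing import Pool

def echo_name(path):
    # Stand-in worker: a real one would do its work on `path`
    # and then return the same name so the caller can account for it.
    return path

def find_missing(func, items, workers=4, chunksize=1):
    # Hypothetical helper: returns the inputs that no worker reported back.
    items = list(items)
    with Pool(workers) as pool:
        # Consuming the iterator inside the with block waits for all tasks.
        seen = set(pool.imap_unordered(func, items, chunksize))
    return [item for item in items if item not in seen]

if __name__ == "__main__":
    names = ["a_vv.tif", "b_vv.tif", "c_vv.tif"]
    print(find_missing(echo_name, names, chunksize=11))
```

If the returned list is empty, every input reached a worker and the problem is more likely inside the worker function or in how the pool is shut down.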

Please notice how the results differ depending on whether the time.sleep call is commented out or not.

import time
from multiprocessing import Pool

def process(x):
    print(x)

def main():
    pool = Pool(4)
    pool.imap_unordered(process, (1,2,3,4,5))
    pool.close()
    #time.sleep(3)

if __name__ == "__main__":
    main()
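A likely explanation for the difference: imap_unordered returns immediately, and pool.close() only stops new tasks from being submitted; it does not wait for the submitted ones. Without the sleep, the main process can exit while workers still have items to handle. The sketch below shows one way to wait without sleeping, by consuming the result iterator and then calling pool.join(). The helper name run_all is an assumption for illustration, and process stands in for tile_geotif from the question.

```python
from multiprocessing import Pool

def process(x):
    # Stand-in for the real work (tile_geotif in the question).
    return x

def run_all(func, items, workers=4):
    # run_all is a hypothetical helper name, not part of multiprocessing.
    pool = Pool(workers)
    try:
        # Iterating over the result blocks until every item has been handled.
        results = list(pool.imap_unordered(func, items))
    finally:
        pool.close()  # no further tasks will be submitted
        pool.join()   # wait for the worker processes to exit
    return results

if __name__ == "__main__":
    # Completion order may vary, so sort before printing.
    print(sorted(run_all(process, (1, 2, 3, 4, 5))))
```

Applied to the question's code, this would mean iterating over the value returned by pool.imap_unordered(tile_geotif, vv_tif, chunksize=11), or at least calling pool.join() after pool.close().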

