
python multiprocessing slow

I have some code that parallelizes calls to a function. Inside the function, I check whether a file exists; if not, I create it, otherwise I do nothing.

I find that if the files do exist, then calling multiprocessing.Process has a fairly huge time penalty compared to a simple for loop. Is this expected, or is there something I can do to reduce the penalty?

import multiprocessing
import os

fl = 'some_output_file'  # placeholder path; defined elsewhere in the real code

def fn():
    # Check if file exists, if yes then return else make the file
    if not os.path.isfile(fl):
        # processing takes enough time to make the parallelization worth it
        pass
    else:
        print('file exists')


pkg_num = 0
total_runs = 2500
threads = []

while pkg_num < total_runs or len(threads):
    if len(threads) < 3 and pkg_num < total_runs:
        t = multiprocessing.Process(target=fn, args=[])
        pkg_num = pkg_num + 1
        t.start()
        threads.append(t)
    else:
        for thread in threads:
            if not thread.is_alive():
                threads.remove(thread)

There's a fair bit of overhead to bringing up processes -- you've got to weigh the overhead of creating those processes against the performance benefit you'll gain from making the tasks concurrent. I'm not sure a simple OS call benefits enough for it to be worthwhile.
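You can see the overhead directly by timing the same trivial check done in-process versus one process per call. This is a minimal sketch; the path is made up, and `os.path.isfile` stands in for the question's cheap existence check:

```python
import multiprocessing
import os
import time

def timed_loop(n, path):
    # n direct stat() calls in the current process -- microseconds each.
    start = time.time()
    for _ in range(n):
        os.path.isfile(path)
    return time.time() - start

def timed_processes(n, path):
    # n calls where each one pays the full process startup/teardown cost.
    start = time.time()
    for _ in range(n):
        p = multiprocessing.Process(target=os.path.isfile, args=(path,))
        p.start()
        p.join()
    return time.time() - start

if __name__ == '__main__':
    # 'maybe_missing.txt' is a made-up path; isfile is cheap either way.
    n = 25
    print('loop:      %.4fs' % timed_loop(n, 'maybe_missing.txt'))
    print('processes: %.4fs' % timed_processes(n, 'maybe_missing.txt'))
```

At 2500 iterations, as in the question, the gap only widens: when the file already exists, almost all of the wall time is process startup, not work.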

Also, for the sake of future generations, you should really check out concurrent.futures.ProcessPoolExecutor; it's way, way cleaner. If you use 2.7, you can use the backport.
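A pool also fixes the overhead problem itself: the workers are created once and reused, instead of spawning a fresh process per task as the question's loop does. A minimal sketch, assuming a variant of the question's `fn` that takes the file path as an argument (the question's version reads a global `fl`):

```python
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor

def fn(fl):
    # Stand-in for the question's worker: create the file only if
    # it does not already exist, otherwise do nothing.
    if not os.path.isfile(fl):
        with open(fl, 'w') as f:
            f.write('generated\n')
        return 'created'
    return 'exists'

if __name__ == '__main__':
    outdir = tempfile.mkdtemp()
    files = [os.path.join(outdir, 'out_%d.txt' % i) for i in range(10)]
    # Three workers are started once and reused across all 10 tasks,
    # mirroring the question's limit of 3 concurrent processes.
    with ProcessPoolExecutor(max_workers=3) as pool:
        print(list(pool.map(fn, files)))
```

`pool.map` also replaces the hand-rolled `while`/`is_alive()` bookkeeping: it blocks until every task finishes and returns the results in order.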

