
Multiprocessing slower than sequential with python

I've invested quite a lot of time in rewriting my code to exploit more cores, but when I benchmarked it I found that all I had achieved was making it 7 times slower than the original code, despite running on 16 cores rather than one! This leads me to believe that I must be doing something wrong.

The code is 4000+ lines and needs a number of pretty heavy input files, so I'm not going to be able to post something that reproduces the problem. However, I can say that the function that I'm calling typically takes 0.1s to run and calls some C libraries using ctypes. It is also passed a fair amount of data in memory - maybe 1 MB? Some pseudo code that looks like the slow bit:

    import multiprocessing as mp

    def AnalyseSection(Args):
        Sectionsi,SectionNo,ElLoads,ElLoadsM,Materials,CycleCount,FlapF,EdgeF,Scaling,Time,FlapFreq,EdgeFreq=Args
        for i in range(len(Sectionsi[Elements])):
            ...  # do some heavy lifting with ctypes
        return Result

    for i in range(10):
        for j in range(10):
            for k in range(10):
                Args=[(Sections[i],SectionList[i],ElLoads,ElLoadsM,Materials,CycleCount,FlapF,EdgeF,Scaling,Time,FlapFreq,EdgeFreq) for n in SectionList]
                pool=mp.Pool(processes=NoCPUs,maxtasksperchild=1)
                result = pool.map(AnalyseSection,Args)
                pool.close()
                pool.join()

I was hoping someone could spot an obvious error that's causing it to run so much more slowly. The function takes a while to run (typically 0.1s per call), so I wouldn't have thought that the overhead associated with multiprocessing could slow it down this much. Any help will be much appreciated!

This

    for i in range(10):
        for j in range(10):
            for k in range(10):
                Args=[(Sections[i],SectionList[i],ElLoads,ElLoadsM,Materials,CycleCount,FlapF,EdgeF,Scaling,Time,FlapFreq,EdgeFreq) for n in SectionList]
                pool=mp.Pool(processes=NoCPUs,maxtasksperchild=1)
                result = pool.map(AnalyseSection,Args)
                pool.close()
                pool.join()

can and should be transformed to this

    pool=mp.Pool(processes=NoCPUs)

    for i in range(10):
        for j in range(10):
            for k in range(10):
                Args=[(Sections[i],SectionList[i],ElLoads,ElLoadsM,Materials,CycleCount,FlapF,EdgeF,Scaling,Time,FlapFreq,EdgeFreq) for n in SectionList]
                result = pool.map(AnalyseSection,Args)

    pool.close()  # close() must come before join()
    pool.join()

This is more in line with what you are trying to achieve: you have a single multiprocessing Pool that you feed data into and wait for results from. You don't have to start and stop the pool in each iteration.
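A minimal, self-contained sketch of the reuse-one-pool pattern (the `analyse_section` stand-in below is hypothetical, not the asker's ctypes code, and the `with` block handles the close/join shutdown automatically):

```python
import multiprocessing as mp

def analyse_section(args):
    # Hypothetical stand-in for the real ctypes-backed work.
    section, scale = args
    return section * scale

if __name__ == "__main__":
    # Create the pool ONCE and reuse it across every iteration;
    # the `with` block closes and joins the pool on exit.
    with mp.Pool(processes=4) as pool:
        results = []
        for i in range(3):
            args = [(n, i) for n in range(5)]
            results.append(pool.map(analyse_section, args))
    print(results)
```

The same worker processes serve all three `map` calls, so the process-startup cost is paid once rather than per iteration.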

Keep in mind that there is a cost associated with starting processes (much bigger than for threads, if you are used to threads).
