
How to calculate average concurrently in Python?

I defined two correct ways of calculating averages in Python.

def avg_regular(values):
    total = 0
    for value in values:
        total += value
    return total/len(values)

def avg_concurrent(values):
    mean = 0
    num_of_values = len(values)
    for value in values:
        #calculate a small portion of the average for each num and add to the total
        mean += value/num_of_values  
    return mean
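A quick way to check the ~30% figure is to time both functions; the absolute numbers depend on the machine, but the extra per-element division makes the second version consistently slower:

```python
from timeit import timeit

# the two functions from the question
def avg_regular(values):
    total = 0
    for value in values:
        total += value
    return total / len(values)

def avg_concurrent(values):
    mean = 0
    num_of_values = len(values)
    for value in values:
        mean += value / num_of_values
    return mean

values = list(range(1_000_000))

# number=10 keeps the run short; exact timings vary by machine
t_regular = timeit(lambda: avg_regular(values), number=10)
t_concurrent = timeit(lambda: avg_concurrent(values), number=10)
print(f"regular: {t_regular:.3f}s, concurrent: {t_concurrent:.3f}s")
```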

The first function is the regular way of calculating averages, but I wrote the second one because each iteration of the loop doesn't depend on the previous ones. So, in theory, the average could be computed in parallel.

However, the "parallel" one (without actually running in parallel) takes about 30% more time than the regular one.

Is my assumption correct, and is it worth the speed loss? If so, how can I make the second function actually run in parallel?

If not, where did I go wrong?

The code you implemented is basically the difference between (a1 + a2 + ... + an) / n and (a1/n + a2/n + ... + an/n). The result is the same, but the second version performs more operations (namely n-1 more divisions), which slows the calculation down.

You claimed that in the second version each loop run is independent of the others. But look at the data each run actually needs. In the first loop, finishing one run requires: total from before the run and the current value. In the second version, finishing one run requires: mean from before the run, the current value, and num_of_values. As you can see, the second version even depends on more values!

But how could we divide the work between cores (which is the goal of multiprocessing)? We could give one core the first half of the values and another core the second half, i.e. ((a1 + ... + a(n//2)) + (a(n//2+1) + ... + an)) / n. Yes, the work of dividing by n is not split between the cores, but it's a single instruction, so we don't really care. We also need to add the left total and the right total, which we can't split, but again that's only a single operation.

So the code we want to run on each core:

def my_sum(values):
    total = 0
    for value in values:
        total += value
    return total
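As a sanity check (still sequential, no processes yet), summing the halves with my_sum and dividing once by n gives exactly the plain average; long_list here is just example data:

```python
def my_sum(values):
    total = 0
    for value in values:
        total += value
    return total

long_list = list(range(10))  # example data
mid = len(long_list) // 2

# (a1 + ... + a(n//2)) + (a(n//2+1) + ... + an), then one division by n
mean = (my_sum(long_list[:mid]) + my_sum(long_list[mid:])) / len(long_list)
print(mean)  # 4.5, same as sum(long_list) / len(long_list)
```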

There's still a problem with Python: normally one could use threads to do the computation, so that each thread runs on one core. But then you have to make sure your program doesn't run into race conditions, and the Python interpreter itself also has to guard its own internal state. CPython decided this isn't worth it: its global interpreter lock (GIL) basically lets only one thread execute Python bytecode at a time. A basic way around this is to use multiple processes via the multiprocessing module.

from multiprocessing import Pool

if __name__ == '__main__':
    # long_list is assumed to be defined, e.g. long_list = list(range(1_000_000))
    mid = len(long_list) // 2

    with Pool(2) as p:  # one worker per half
        results = p.map(my_sum, [long_list[:mid], long_list[mid:]])

    print(sum(results) / len(long_list))  # add the subresults and divide by n

But of course multiple processes don't come for free. You need to fork, copy data, etc., so you won't get the 2x speedup one might expect. Also, the biggest slowdown is actually Python itself; it isn't really optimized for fast numerical computation. There are various ways around that, but using numpy is probably the simplest. Just use:

import numpy
print(numpy.mean(long_list))

This is probably much faster than the pure-Python version. I don't think numpy uses multiprocessing internally, so one could gain a further boost by combining multiple processes with a fast implementation (numpy, or something else written in C), but normally numpy alone is fast enough.
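If one did want to combine both ideas, here's a sketch: each worker sums one numpy chunk. The worker and chunk counts of 4 are arbitrary choices, and for most array sizes plain numpy.mean will still win because of process startup and data-copying overhead:

```python
from multiprocessing import Pool

import numpy as np

def chunk_sum(chunk):
    # numpy's C loop does the per-chunk work inside each worker
    return np.sum(chunk)

if __name__ == '__main__':
    long_list = np.arange(1_000_000, dtype=np.float64)
    chunks = np.array_split(long_list, 4)  # 4 chunks, arbitrary

    with Pool(4) as p:
        partial_sums = p.map(chunk_sum, chunks)

    print(sum(partial_sums) / len(long_list))  # 499999.5
```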
