Speeding up summation in Python

I have a large list, listn1, of numbers n1. For each number I want to add up all of its multiples, provided that the multiplier is not in another list, listn2 (prime numbers with specific characteristics), and the product does not exceed a maximum, max_n. The multiplier is below 10 in most cases, but can go up to 100,000. My code so far looks like this:

s = 0
for n1 in listn1:
    s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)

The problem: this approach is sloooow. It takes seconds to calculate listn1 and listn2, so I know that there are more than a million numbers to add. But I started the summation yesterday evening and it is still running this morning.

Is there a Pythonic way to speed up this summation?

I have two suggestions for you.

First of all, you don't have to multiply i by n1 at each iteration. You can replace

s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)

with

s += n1 * sum(i for i in range(1, 1 + max_n // n1) if i not in listn2)

The two are exactly equivalent, because the common factor n1 can be pulled out of the sum.
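As a quick sanity check, the two forms can be compared on a small input. This is only a sketch: the sample values and the helper names sum_inside and sum_factored are made up for illustration and are not part of the original code.

```python
# Hypothetical sample data, chosen only to illustrate the identity.
listn1 = [3, 7, 10]
listn2 = [2, 3, 5]
max_n = 100

def sum_inside(listn1, listn2, max_n):
    # Original form: every term is multiplied by n1 inside the generator.
    return sum(
        sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)
        for n1 in listn1
    )

def sum_factored(listn1, listn2, max_n):
    # Factored form: add up the multipliers first, then multiply by n1 once.
    return sum(
        n1 * sum(i for i in range(1, 1 + max_n // n1) if i not in listn2)
        for n1 in listn1
    )

total_inside = sum_inside(listn1, listn2, max_n)
total_factored = sum_factored(listn1, listn2, max_n)
print(total_inside == total_factored)
```

The factored form saves one multiplication per term, which helps a little but does not change how many terms are visited.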

Secondly, without the if i not in listn2 condition, you have a simple summation:

sum(i for i in range(1, 1 + max_n // n1))

This is the same as sum([1, 2, 3, 4, 5, 6, 7, 8, ..., (max_n // n1)]), and equals (max_n // n1) * (1 + max_n // n1) / 2. As a simple example, 1 + 2 + 3 + 4 = 10 = 4 * 5 / 2. In Python, write the division as // 2 so the result stays an integer; the product of two consecutive integers is always even, so nothing is lost.
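The closed form can be checked against the explicit sum in a few lines. A minimal sketch, assuming a helper named triangular (a hypothetical name) and arbitrary sample values for max_n and n1:

```python
def triangular(m):
    # Closed form for 1 + 2 + ... + m; // keeps the result an exact integer.
    return m * (m + 1) // 2

max_n, n1 = 1000, 7          # hypothetical values
m = max_n // n1              # largest multiplier with n1 * m <= max_n

explicit = sum(range(1, m + 1))
closed = triangular(m)
print(explicit == closed)

# With / instead of //, Python returns a float rather than an int.
print(type(m * (m + 1) / 2).__name__, type(closed).__name__)
```

The closed form takes constant time per n1, whereas the explicit sum takes time proportional to max_n // n1.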

To handle the if i not in listn2 condition: if your listn2 is the smaller collection, you can iterate over listn2 instead of over every multiplier.

So, compute the full sum 1 + 2 + ... + (max_n // n1) with the formula and subtract the items of listn2 that fall in that range:

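def sum_until(l, limit):
    # Sum of the items of l that do not exceed limit
    return sum(x for x in l if x <= limit)

listn2 = list(set(listn2))  # drop duplicates so nothing is subtracted twice

s = 0
for n1 in listn1:
    finish = max_n // n1  # largest multiplier with n1 * finish <= max_n
    s += n1 * (finish * (finish + 1) // 2 - sum_until(listn2, finish))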
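The loop above still scans all of listn2 once per n1. If that becomes the bottleneck, one possible refinement, which is not part of the original answer, is to sort listn2 once, precompute running totals, and look up the cut-off with a binary search. The sketch below assumes every entry of listn2 is a positive integer (they are primes in the question); the function names total_sum and brute_force are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

def total_sum(listn1, listn2, max_n):
    # Sorted, de-duplicated excluded multipliers and their running totals.
    # Assumes all entries of listn2 are positive integers.
    excluded = sorted(set(listn2))
    prefix = [0] + list(accumulate(excluded))  # prefix[k] = sum of first k items

    s = 0
    for n1 in listn1:
        finish = max_n // n1
        k = bisect_right(excluded, finish)  # number of excluded items <= finish
        s += n1 * (finish * (finish + 1) // 2 - prefix[k])
    return s

def brute_force(listn1, listn2, max_n):
    # Direct translation of the loop in the question, for cross-checking.
    return sum(
        n1 * i
        for n1 in listn1
        for i in range(1, 1 + max_n // n1)
        if i not in listn2
    )

# Hypothetical sample data.
print(total_sum([3, 7, 10], [2, 3, 5], 100) == brute_force([3, 7, 10], [2, 3, 5], 100))
```

Each n1 then costs one binary search instead of a full pass over listn2, at the price of sorting listn2 once up front.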

EDIT:

I guess NumPy would be faster for the sum. Make listn2 a NumPy array:

import numpy as np

listn2 = np.array(list(set(listn2))) 

And use this sum_until function:

def sum_until(listn2, limit):
    # Boolean mask selects the items that do not exceed limit
    l = listn2[listn2 <= limit]
    return int(np.sum(l))

Taking the suggestions on board, I rewrote the code to

setn2 = set(listn2)
s = 0
for n1 in listn1:
    finish = max_n // n1  # largest multiplier with n1 * finish <= max_n
    s += n1 * (finish * (finish + 1) // 2 - sum(i for i in range(1, 1 + finish) if i in setn2))

Instead of hours, the summation now takes seconds.
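Part of that speed-up plausibly comes from testing membership in a set rather than a list: a list lookup scans the elements one by one, while a set lookup is a hash probe. The sketch below uses made-up stand-in data and a hypothetical helper count_hits to compare the two with timeit; the actual timings depend on the machine, so none are quoted here.

```python
import timeit

# Hypothetical stand-in for listn2; the real list holds specific primes.
listn2 = list(range(2, 5000))
setn2 = set(listn2)
probes = range(4900, 5000)   # values near the end of the list (worst case for a scan)

def count_hits(container):
    # Count how many probe values are found in the container.
    return sum(1 for i in probes if i in container)

list_time = timeit.timeit(lambda: count_hits(listn2), number=10)
set_time = timeit.timeit(lambda: count_hits(setn2), number=10)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

Both containers give the same answer; only the cost of each in test differs.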

Several hours later

Coming across this old question of mine, it turns out that, in retrospect, I did the right thing. The idea mentioned in this thread of using numpy.sum() instead was well intended but wrong, as shown here. If you already work in NumPy, fine, but if you have a Python list, use comprehensions.
