I have a large list with numbers n1
in listn1
. I want to add all multiples of each number, if the multiplicand is not in another list listn2
(prime numbers with specific characteristics) and the product is below a maximum max_n
. The multiplicand is in most cases below 10, but can go up to 100,000. My code so far looks like:
s = 0
for n1 in listn1:
s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)
The problem: this approach is sloooow. It takes seconds to calculate listn1
and listn2
, so I know that there are more than a million numbers to add. But I started the summation yesterday evening and it is still running this morning.
Is there a Pythonic way to speed up this summation?
I have 2 suggestions for you.
First of all , you don't have multiply i
with n1
at each iteration. You can replace
s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)
with
s += n1 * sum(i for i in range(1, 1 + max_n // n1) if i not in listn2)
They are totally same.
Secondly , without if i not in listn2
condition, you have a simple summation:
sum(i for i in range(1, 1 + max_n // n1)
This is same with sum([1, 2, 3, 4, 5, 6, 7, 8, ..., (max_n // n1)])
, and equals (max_n // n1) * (1 + max_n // n1) / 2
. For a simply example, take a look at this .
To handle if i not in listn2
condition, if your listn2
is smaller, you can sum listn2
instead of listn1
.
So , find the sum of listn1
and subtract the items in listn2
:
def sum_until(l, max):
return sum([x for x in l if x < max])
listn2 = list(set(listn2))
for n1 in listn1:
finish = max_n // n1
s += n1 * (finish * (finish + 1) / 2 - sum_until(listn2, finish))
EDIT:
I guess NumPy would be faster for sum. Make listn2
a numpy array:
import numpy as np
listn2 = np.array(list(set(listn2)))
And use this sum_until
function:
def sum_until(listn2, max):
l = listn2[np.where(listn2 <= max)]
return int(np.sum(l))
Taking the suggestions on board, I re-wrote the code to
setn2 = set(listn2)
s = 0
for n1 in listn1:
s += n1 * (max_n * (max_n + 1) // 2 - sum(i for i in range(1, 1 + max_n // n1) if i in setn2))
Instead of hours, the summation now takes seconds.
Several hours later
Coming across this old question of mine, it turned out that retrospectively, I did the right thing. The idea mentioned here in this thread to use numpy.sum()
instead was well intended but wrong, as shown here . If you work in numpy, fine, but if you have a Python list, use comprehensions.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.