How can I improve the speed of this list iteration in Python 3.9.0?
The following code is a measured hot spot, distilled from some code I'm writing. I'm trying to figure out how to speed up this loop in Python 3.9.0. I measure the same loop to be >30x faster using std::vector in VC++ 2019.
As you can see, I've tried a few different methods. The map() function returns an iterator, so I converted it to a list to measure the full cost of execution.
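To illustrate why the list() call matters for the measurement, here is a minimal sketch showing that map() does no work until its result is consumed. The helper name decrement and the calls list are made up for this illustration and are not part of the benchmark.

```python
# map() builds a lazy iterator: the function is not called until items are consumed.
calls = []

def decrement(row):
    calls.append(row[0])                 # record each invocation
    return [row[0] - 1, row[1], row[2]]

data = [[i, i, i] for i in range(5)]

lazy = map(decrement, data)
calls_before = len(calls)                # nothing has run yet

result = list(lazy)                      # consuming the iterator forces every call
calls_after = len(calls)

print(calls_before, calls_after)
```

Timing only the map() call, without list(), would therefore measure little more than the creation of the iterator object.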
I feel this is a fairly natural way to represent my data. I could certainly work on some representational or algorithmic improvements here. However, I'm somewhat surprised that iteration is so slow in this case, and I'd like to see first whether it can be improved.
Performance results from executing python listIteration.py:
Iteration by index
66.66 ms
60.90 ms
62.74 ms
Total: 124998250000
Iteration by index -- just integers
55.22 ms
55.27 ms
80.84 ms
Total: 124998250000
Iteration by object
56.48 ms
60.30 ms
55.77 ms
Total: 124998250000
List comprehension
235.34 ms
328.15 ms
272.47 ms
Total: 124998250000
Map
310.81 ms
353.87 ms
300.27 ms
Total: 124998250000
Code:
import time

def makeList():
    # 500000 rows, each a small list [i, i, i]
    data = []
    for i in range(500000):
        data.append([i, i, i])
    return data

def makeListOfInts():
    # 500000 plain integers
    data = []
    for i in range(500000):
        data.append(i)
    return data

def dumpTime(delta):
    # delta is in seconds; print it in milliseconds
    print("{:.2f}".format(1000.0*delta) + " ms")

NUM_TRIALS = 3

print("Iteration by index")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for j in range(len(data)):
        data[j][0] -= 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))

print("Iteration by index -- just integers")
data = makeListOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for j in range(len(data)):
        data[j] -= 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum(data)
print("Total: "+ str(total))

print("Iteration by object")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for v in data:
        v[0] -= 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))

print("List comprehension")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    # builds a brand-new list of new row lists on every trial
    data = [[x[0]-1, x[1], x[2]] for x in data]
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))

print("Map")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    # here we convert the map object to a list, because
    # map() returns an iterator, and we want to measure the full cost
    # of the computation
    data = list(map(lambda x: [x[0]-1, x[1], x[2]], data))
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))
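
The single perf_counter readings above vary noticeably between trials (for example 55 ms and 80 ms for the same loop). As a rough sketch, the in-place loops could instead be timed with timeit.repeat, reporting the best of several repeats. The helper names (make_list, by_index, by_object, best_ms) and the smaller N are assumptions made for this illustration, not part of the original script.

```python
import timeit

N = 50_000  # assumption: smaller than the 500000 used above, to keep the sketch quick

def make_list(n=N):
    return [[i, i, i] for i in range(n)]

def by_index(data):
    for j in range(len(data)):
        data[j][0] -= 1

def by_object(data):
    for v in data:
        v[0] -= 1

def best_ms(func, repeats=5):
    """Run func(data) `repeats` times and return (fastest run in ms, data)."""
    data = make_list()
    times = timeit.repeat(lambda: func(data), number=1, repeat=repeats)
    return 1000.0 * min(times), data

for name, func in [("by index", by_index), ("by object", by_object)]:
    ms, _ = best_ms(func)
    print(f"{name}: {ms:.2f} ms (best of 5)")
```

Taking the minimum filters out runs that were disturbed by other activity on the machine. Placing each loop in a function also means its variables are locals rather than module-level globals, which CPython generally looks up faster; how much that helps here would need to be measured.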
Python code is going to be slower than C++. There is no way around it, unless you eliminate the iteration or outsource it to a C backend, which is what numpy does.
For example, you could do:
import numpy as np

def makeArray():
    # stack three copies of 0..499999 as rows, then transpose to shape (500000, 3)
    data = np.vstack((np.arange(500000), np.arange(500000), np.arange(500000))).T
    return data

def makeArrayOfInts():
    data = np.arange(500000)
    return data
And then, you wouldn't need to iterate at all.
data = makeArray()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    # subtract 1 from the whole first column in one vectorized operation
    data[:, 0] = data[:, 0] - 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum(data[:, 0])
print("Total: "+ str(total))

data = makeArrayOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    # subtract 1 from every element; this allocates a new array each trial
    data = data - 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum(data)
print("Total: "+ str(total))
Both of these are very fast: each trial takes ~1 ms, as opposed to the ~50 ms needed for iterating over the lists.
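
As a further sketch along the same lines, the first column can be updated in place with an augmented assignment, and the total can be computed with the array's own sum() method so that the reduction also runs in compiled code. The explicit int64 dtype is an assumption added here so that the total fits even on platforms where numpy's default integer is 32 bits wide; no timings are claimed for this variant.

```python
import time
import numpy as np

N = 500_000

# Explicit int64 (an assumption of this sketch) so the column sum cannot
# overflow where the default integer type is only 32 bits wide.
data = np.empty((N, 3), dtype=np.int64)
data[:] = np.arange(N, dtype=np.int64)[:, None]   # broadcast: each row is [i, i, i]

start = time.perf_counter()
data[:, 0] -= 1                  # in-place update of the first column
elapsed = time.perf_counter() - start

total = int(data[:, 0].sum())    # vectorized reduction instead of the built-in sum()
print(f"{1000.0 * elapsed:.2f} ms")
print("Total:", total)
```

This runs a single trial, so its total differs from the totals above, which reflect three decrements.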