
How can I improve the speed of this list iteration in Python 3.9.0?

The following code is a measured hot spot, distilled from some code I'm writing. I'm trying to figure out how to speed up this loop in Python 3.9.0. I measure the same loop to be more than 30x faster using std::vector in VC++ 2019.

As you can see, I've tried a few different methods. The map() function returns an iterator, so I converted it to a list to measure the full cost of execution.
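To illustrate why the map() result is wrapped in list() before timing: in Python 3, map() is lazy, so no work happens until something consumes the iterator. A minimal sketch follows; the names `calls` and `decrement` are hypothetical and chosen only for this illustration.

```python
calls = []  # hypothetical log of which inputs have actually been processed

def decrement(x):
    """Record the call, then return x - 1."""
    calls.append(x)
    return x - 1

lazy = map(decrement, [10, 20, 30])
calls_before = len(calls)   # nothing has been computed yet

result = list(lazy)         # consuming the iterator runs decrement on every item
calls_after = len(calls)

print(calls_before, calls_after, result)  # prints: 0 3 [9, 19, 29]
```

Timing only the map() call would therefore measure the cost of creating the iterator, not the cost of the computation.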

I feel this is a fairly natural way to represent my data. I could certainly work on some representational or algorithmic improvements here. However, I'm sort of surprised that iteration is so slow in this case, and I'd like to see if it can be improved, first.

Performance results from executing python listIteration.py:

Iteration by index
66.66 ms
60.90 ms
62.74 ms
Total: 124998250000
Iteration by index -- just integers
55.22 ms
55.27 ms
80.84 ms
Total: 124998250000
Iteration by object
56.48 ms
60.30 ms
55.77 ms
Total: 124998250000
List comprehension
235.34 ms
328.15 ms
272.47 ms
Total: 124998250000
Map
310.81 ms
353.87 ms
300.27 ms
Total: 124998250000

Code:

import time

def makeList():
    data = []
    for i in range(500000):
        data.append([i, i, i])
    return data

def makeListOfInts():
    data = []
    for i in range(500000):
        data.append(i)
    return data    

def dumpTime(delta):
    print("{:.2f}".format(1000.0*delta) + " ms")


NUM_TRIALS = 3

print("Iteration by index")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    
    for j in range(len(data)):
        data[j][0] -= 1

    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))

print("Iteration by index -- just integers")
data = makeListOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    
    for j in range(len(data)):
        data[j] -= 1

    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum(data)
print("Total: "+ str(total))

print("Iteration by object")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    
    for v in data:
        v[0] -= 1

    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])    
print("Total: "+ str(total))

print("List comprehension")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    
    data = [[x[0]-1, x[1], x[2]] for x in data]
    
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))    

print("Map")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    
    # here we convert the map object to a list, because apparently
    # map() returns an iterator, and we want to measure the full cost
    # of the computation
    data = list(map(lambda x: [x[0]-1, x[1], x[2]], data))
    
    x2 = time.perf_counter()
    dumpTime(x2-x1)
total = sum([x[0] for x in data])
print("Total: "+ str(total))    

Python code is going to be slower than C++. There is no way around it, unless you eliminate the iteration or outsource it to a C backend, which is what numpy does.

For example, you could do:

import numpy as np

def makeArray():
    data = np.vstack((np.arange(500000), np.arange(500000), np.arange(500000))).T
    return data

def makeArrayOfInts():
    data = np.arange(500000)
    return data

And then, you wouldn't need to iterate at all.

data = makeArray()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    data[:, 0] = data[:, 0] - 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
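# Use NumPy's own reduction instead of a Python-level sum; the int64
# accumulator avoids overflow where the default integer is 32-bit
total = data[:, 0].sum(dtype=np.int64)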
print("Total: "+ str(total))

data = makeArrayOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    data = data - 1
    x2 = time.perf_counter()
    dumpTime(x2-x1)
# Same here: reduce inside NumPy with a 64-bit accumulator
total = data.sum(dtype=np.int64)
print("Total: "+ str(total))

Both of these are very fast: each trial takes ~1 ms, as opposed to the ~50 ms needed to iterate over the lists.
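As a rough cross-check, the sketch below applies the same decrement with a plain Python loop and with a NumPy slice operation, then confirms that both give the same total. The helper names `loop_decrement` and `numpy_decrement` and the smaller size `N` are assumptions made for illustration. Timings vary by machine, so none are quoted here.

```python
import time
import numpy as np

N = 100_000  # smaller than the 500000 used above, to keep the sketch quick

def loop_decrement(n):
    """Build a list of [i, i, i] rows and decrement column 0 in a Python loop."""
    data = [[i, i, i] for i in range(n)]
    start = time.perf_counter()
    for row in data:
        row[0] -= 1
    elapsed = time.perf_counter() - start
    return sum(row[0] for row in data), elapsed

def numpy_decrement(n):
    """Build an (n, 3) int64 array and decrement column 0 with one slice operation."""
    col = np.arange(n, dtype=np.int64)
    data = np.stack((col, col, col), axis=1)
    start = time.perf_counter()
    data[:, 0] -= 1  # in-place, executed inside NumPy's C code
    elapsed = time.perf_counter() - start
    return int(data[:, 0].sum()), elapsed

loop_total, loop_time = loop_decrement(N)
numpy_total, numpy_time = numpy_decrement(N)
print("Totals match:", loop_total == numpy_total)
print("{:.2f} ms (loop) vs {:.2f} ms (NumPy)".format(1000 * loop_time, 1000 * numpy_time))
```

The array is created with an explicit int64 dtype so that the total cannot overflow on platforms where NumPy's default integer is 32-bit.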

