Python performance when iterating through lists and numpy arrays

Question

I am working on a project where I have to loop through large arrays (lists), accessing each item by index. Usually this involves checking each element against a condition and then potentially updating its value.

I've noticed that this is extremely slow compared to, for example, doing a similar thing in C#. Here is a sample that simply loops through an array and reassigns each value:

C#:

var a = new double[10000000];
var watch = System.Diagnostics.Stopwatch.StartNew();

for (int i = 0; i < a.Length; i++)
{
     a[i] = 1.0;
}      
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
//About 40ms

Python:

import time

a = []
for i in range(0, 10000000):
    a.append(0.0)

t1 = time.perf_counter()  # time.clock() was removed in Python 3.8

for i in range(0, 10000000):
    a[i] = 1.0

t2 = time.perf_counter()
totalTime = t2 - t1
# About 900ms

The Python code here seems to be over 20x slower. I am relatively new to Python, so I cannot judge whether this kind of performance is "normal" or whether I am doing something horribly wrong here. I am running Anaconda as my Python environment, and PyCharm is my IDE.

Note: I have tried using nditer on numpy arrays, with no significant performance increase.

Many thanks in advance for any tips!

UPDATE: I've just compared the following two approaches:

import numpy as np

# timeit: 43ms
a = np.random.randn(1000,1000)
a[a < 0] = 100.0

# timeit: 1650ms
a = np.random.randn(1000,1000)
for x in np.nditer(a, op_flags=['readwrite']):
    if x < 0:
        x[...] = 100.0

Looks like the first (vectorized) approach is the way to go here...

Answer 1

If you are using numpy, you should use the numpy array type and then take advantage of numpy functions and broadcasting.

If your specific need is to assign 1.0 to all elements, numpy has a dedicated function for that:

import numpy as np
a = np.ones(10_000_000)

For a somewhat more general approach, where you first create the array and then assign its elements:

import numpy as np
a = np.empty(10_000_000)
a[:] = 1.0  # the scalar is broadcast to every element

By default, np.ones and np.empty create arrays of dtype float64, so all elements will be doubles.
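
To make the two points above concrete, here is a minimal sketch (the small size `n` is only for illustration and is not from the original post) that checks the default dtype and confirms that broadcasting a scalar into an empty array gives the same result as np.ones:

```python
import numpy as np

n = 1_000  # small size for illustration; the thread uses 10_000_000

a = np.empty(n)   # uninitialized memory, default dtype float64
a[:] = 1.0        # the scalar is broadcast across every element

b = np.ones(n)    # same result in one call

print(a.dtype)                 # float64
print(np.array_equal(a, b))    # True
```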

Also, as a side note, timing a single run with the time module is not ideal, since it measures "wall clock time" and can be heavily affected by other programs running on the computer. Consider looking into the timeit module.
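
As a rough sketch of what that could look like, the snippet below times the index loop from the question with timeit.repeat and keeps the fastest run, which is less sensitive to other activity on the machine. The function name `fill_by_index` and the reduced list size are assumptions made for this example only.

```python
import timeit

def fill_by_index(values):
    """Assign 1.0 to every slot by index, as in the question's loop."""
    for i in range(len(values)):
        values[i] = 1.0

data = [0.0] * 100_000  # smaller than the question's list, to keep the run short

# Each entry in `timings` is the total time in seconds for `number` calls.
timings = timeit.repeat(lambda: fill_by_index(data), number=10, repeat=5)
best_per_call = min(timings) / 10

print(f"best of 5: {best_per_call * 1e3:.3f} ms per call")
```

The absolute numbers depend on the machine, so no expected output is shown here.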

Answer 2

Python has a bunch of great ways to iterate over structures. There is a good chance that, if you find yourself using what I call C-style looping:

for i in range(0, 10000000):
    a[i] = 1.0

... you are likely doing it wrong.

In my quick testing, this is 40x faster than the above:

a = [1.0] * 10000000

Even a basic list comprehension is 3x faster:

a = [1.0 for i in range(0, 10000000)]
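
The question also mentions checking each element against a condition before updating it. A comprehension can express that too. The sketch below is only an illustration: the function names, the threshold of 0, and the replacement value are assumptions borrowed from the question's UPDATE, and no timing is claimed.

```python
def replace_negatives_loop(values, replacement=100.0):
    """C-style: index into the list and overwrite matching items in place."""
    for i in range(len(values)):
        if values[i] < 0:
            values[i] = replacement
    return values

def replace_negatives_comprehension(values, replacement=100.0):
    """Build a new list using a conditional expression."""
    return [replacement if v < 0 else v for v in values]

sample = [0.5, -1.2, 3.0, -0.1]
print(replace_negatives_comprehension(sample))   # [0.5, 100.0, 3.0, 100.0]
print(replace_negatives_loop(list(sample)))      # [0.5, 100.0, 3.0, 100.0]
```

The comprehension returns a new list; if other code holds a reference to the original list, assigning the result back with `values[:] = ...` updates it in place.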

And as mentioned in the comments, cython, numba and numpy can all be used to provide various speedups for this sort of work.
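
For the numpy route, a conditional update does not need an explicit loop. The following sketch (array size, seed, and variable names are assumptions for illustration) checks that boolean-mask assignment and np.where give the same result as an element-by-element loop:

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded generator, for a repeatable sample
a = rng.standard_normal((100, 100))   # smaller than the question's 1000 x 1000

# Reference result from an explicit Python loop over a copy.
expected = a.copy()
for idx in np.ndindex(expected.shape):
    if expected[idx] < 0:
        expected[idx] = 100.0

in_place = a.copy()
in_place[in_place < 0] = 100.0        # boolean-mask assignment, modifies in place

as_new = np.where(a < 0, 100.0, a)    # returns a new array, leaves `a` untouched

print(np.array_equal(in_place, expected))  # True
print(np.array_equal(as_new, expected))    # True
```

Use the mask form when the array should be changed in place, and np.where when the original values need to be kept.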

Answer 1
0 2018-03-18 15:07:51

Answer 2
-1 2018-03-18 15:16:25