
Python performance iterating through lists, numpy arrays

I am working on a project where I have to loop through large arrays (lists), accessing each item by index. Usually this involves checking each element against a condition, and thereafter potentially updating its value.

I've noticed that this is extremely slow compared to, for example, doing a similar thing in C#. Here is a sample of simply looping through arrays and reassigning each value:

C#:

var a = new double[10000000];
var watch = System.Diagnostics.Stopwatch.StartNew();

for (int i = 0; i < a.Length; i++)
{
     a[i] = 1.0;
}      
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
//About 40ms

Python:

import time

a = []
for i in range(0, 10000000):
    a.append(0.0)

t1 = time.perf_counter()

for i in range(0, 10000000):
    a[i] = 1.0

t2 = time.perf_counter()
totalTime = t2 - t1
# About 900ms

The Python code here seems to be over 20x slower. I am relatively new to Python, so I cannot judge whether this kind of performance is "normal" or if I am doing something horribly wrong here. I am running Anaconda as my Python environment and PyCharm as my IDE.

Note: I have tried using nditer on numpy arrays with no significant performance increase.

Many thanks in advance for any tips!

UPDATE: I've just compared the following two approaches:

#timeit: 43ms
a = np.random.randn(1000,1000)
a[a < 0] = 100.0

#timeit: 1650ms
a = np.random.randn(1000,1000)
for x in np.nditer(a, op_flags=['readwrite']):
    if (x < 0):
        x[...] = 100.0

looks like the first (vectorized) approach is the way to go here...
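
As a side note to the update, np.where should also work for this kind of conditional replacement; I have not benchmarked it, but it stays vectorized (it builds a new array rather than modifying a in place):

import numpy as np

a = np.random.randn(1000, 1000)
# np.where evaluates the condition element-wise and returns a new array
a = np.where(a < 0, 100.0, a)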

If you are using numpy, you should use the numpy array type and then take advantage of numpy functions and broadcasting.

If your specific need is to assign 1.0 to all elements, there is a dedicated function for that in numpy:

import numpy as np
a = np.ones(10_000_000)

For a somewhat more general approach, where you first allocate the array and then assign values to all of its elements:

import numpy as np
a = np.empty(10_000_000)
a[:] = 1.0 # This uses broadcasting

By default, numpy arrays are of type float64 (double precision), so all elements will be doubles.
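
If you want to double-check this, printing the dtype should confirm it:

import numpy as np

a = np.empty(10_000_000)
print(a.dtype)  # expected to print float64, numpy's default floating-point type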

Also, as a side note, using time for performance measurement is not optimal, since that measures "wall clock time" and can be heavily affected by other programs running on the computer. Consider looking into the timeit module.
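
For example, a minimal timeit sketch for the broadcast assignment above (the exact numbers will depend on your machine):

import timeit

# set up the array once, then time only the assignment, repeated 10 times
setup = "import numpy as np; a = np.empty(10_000_000)"
elapsed = timeit.timeit("a[:] = 1.0", setup=setup, number=10)
print(elapsed / 10)  # average seconds per assignment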

Python has a bunch of great ways to iterate over structures. There is a good chance, if you find yourself using what I call C-style looping:

for i in range(0, 10000000):
    a[i] = 1.0

... you are likely doing it wrong.

In my quick testing this is 40x faster than the above:

a = [1.0] * 10000000

Even a basic list comprehension is 3x faster:

a = [1.0 for i in range(0, 10000000)]

And as mentioned in the comments, cython, numba and numpy can all be used to provide various speedups for this sort of work.
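
For instance, here is a rough numba sketch (assuming numba is installed; the function name fill_ones is just for illustration) that compiles the same C-style loop to machine code:

import numpy as np
from numba import njit

@njit
def fill_ones(a):
    # the plain index loop from the question, but JIT-compiled by numba
    for i in range(a.shape[0]):
        a[i] = 1.0

a = np.empty(10_000_000)
fill_ones(a)  # the first call includes compilation time; later calls are fast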
