简体   繁体   中英

for loop to list comprehension or map in python

I'm trying to improve the speed of some python code a bit and therefore trying to move a standard for loop to either a list comprehension or map call:

    buf = [0 for i in range(self.numLEDs * 3)]
    temp = [0,0,0]
    for x in range(self.numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[self.c_order[0]] = self.gamma[r]
        temp[self.c_order[1]] = self.gamma[g]
        temp[self.c_order[2]] = self.gamma[b]

        buf[x * 3:x * 3 + 3] = temp

c_order is simply another list, in this case [1,2,0]. It controls the channel order for some RGB pixels. gamma is a list 256 elements long that holds gamma corrected values for each of the 8bit channel values.

What'd I'd like to do is somehow completely remove any use of a standard for loop from this bit of code. I've managed to do it without the channel swap, but with the gamma correction and it's twice as fast. Like this:

corrected = [gamma[i] for i in data]
buf[0:len(corrected)] = corrected

How can I swap the order of list elements as I go without a for loop though?

You can have everything done in numpy in a few lines and slightly faster:

In [69]:

gamma=list(np.random.rand(256))
numLEDs=10
data=list(np.random.randint(0,256,30))
c_order=[0,1,2]
In [70]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
10000 loops, best of 3: 47.3 µs per loop
In [85]:

gamma=np.array(gamma)
data=np.array(data)

In [86]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
10000 loops, best of 3: 38.3 µs per loop

When you have a lot of LED's, the numpy version will be much faster than the loop version:

In [98]:

gamma=list(np.random.rand(256))
numLEDs=1000
data=list(np.random.randint(0,256,3000))
c_order=[0,1,2]
In [99]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
100 loops, best of 3: 4.08 ms per loop
In [100]:

gamma=np.array(gamma)
data=np.array(data)

In [101]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
1000 loops, best of 3: 244 µs per loop

So you need pure python code without any extension library.

To speedup the code:

  1. use local variable in loops.
  2. change for loop to list comprehension.

Here is the code:

class Test(object):

    def __init__(self, n):
        self.numLEDs =  n
        self.c_order = [1, 2, 0]
        self.gamma = [i // 2 for i in range(256)]

    def do1(self, data):
        buf = [0 for i in range(self.numLEDs * 3)]
        temp = [0,0,0]
        for x in range(self.numLEDs):
            r = data[x*3]
            g = data[x*3+1]
            b = data[x*3+2]
            temp[self.c_order[0]] = self.gamma[r]
            temp[self.c_order[1]] = self.gamma[g]
            temp[self.c_order[2]] = self.gamma[b]

            buf[x * 3:x * 3 + 3] = temp
        return buf

    def do2(self, data):
        buf = [0] * (self.numLEDs * 3)
        gamma = self.gamma
        for idx, idx2 in enumerate(self.c_order):
            buf[idx2::3] = [gamma[v] for v in data[idx::3]]
        return buf

import random
random.seed(0)
N = 1000
t = Test(N)
data = [random.randint(0, 255) for i in range(3*N)]
r1 = t.do1(data)
r2 = t.do2(data)
print r1 == r2  # check the result

%timeit t.do1(data)
%timeit t.do2(data)

the output, it's 6x faster:

True
1000 loops, best of 3: 1.1 ms per loop
10000 loops, best of 3: 176 µs per loop

Contrary to popular belief, calling a map function will not give you significant speedup. You may actually see worse performance.

Depending on how long you spend in this section of code, this may be the perfect situation where simply porting this loop to C makes sense. See here .

Make sure that you're actually spending a lot of time in this for-loop, otherwise the overhead of calling your C code will outweigh any potential performance gains.

Read here for some potential alternatives if you decide to use to port this code to C:

  1. ctypes vs C extension
  2. Wrapping a C library in Python: C, Cython or ctypes?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM