
For loop to list comprehension or map in Python

I'm trying to speed up some Python code a bit, so I'm trying to convert a standard for loop into either a list comprehension or a map call:

    buf = [0 for i in range(self.numLEDs * 3)]
    temp = [0,0,0]
    for x in range(self.numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[self.c_order[0]] = self.gamma[r]
        temp[self.c_order[1]] = self.gamma[g]
        temp[self.c_order[2]] = self.gamma[b]

        buf[x * 3:x * 3 + 3] = temp

c_order is simply another list, in this case [1,2,0]. It controls the channel order for some RGB pixels. gamma is a 256-element list that holds the gamma-corrected value for each possible 8-bit channel value.

What I'd like to do is completely remove the standard for loop from this bit of code. I've managed to do it with the gamma correction but without the channel swap, and it's twice as fast. Like this:

    corrected = [gamma[i] for i in data]
    buf[0:len(corrected)] = corrected

How can I swap the order of list elements as I go without a for loop, though?

You can do everything in numpy in a few lines, and it is slightly faster:

In [69]:

gamma=list(np.random.rand(256))
numLEDs=10
data=list(np.random.randint(0,256,30))
c_order=[0,1,2]
In [70]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
10000 loops, best of 3: 47.3 µs per loop
In [85]:

gamma=np.array(gamma)
data=np.array(data)

In [86]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
10000 loops, best of 3: 38.3 µs per loop

When you have a lot of LEDs, the numpy version is much faster than the loop version (roughly 17x in the run below):

In [98]:

gamma=list(np.random.rand(256))
numLEDs=1000
data=list(np.random.randint(0,256,3000))
c_order=[0,1,2]
In [99]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
100 loops, best of 3: 4.08 ms per loop
In [100]:

gamma=np.array(gamma)
data=np.array(data)

In [101]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
1000 loops, best of 3: 244 µs per loop
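One caveat worth checking before relying on the snippet above: np.take(..., c_order, axis=0) gathers (output channel i comes from input channel c_order[i]), while the original loop scatters (input channel i is written to position c_order[i]). The two agree for the c_order=[0,1,2] used in these timings, but not for the [1,2,0] order from the question. The following is a minimal sketch, not the answer's own code, of a numpy version that scatters like the loop; the function names remap_loop and remap_numpy are made up for illustration.

```python
import numpy as np

def remap_loop(data, gamma, c_order):
    """Reference: the per-LED loop from the question (scatter into temp)."""
    num_leds = len(data) // 3
    buf = [0] * (num_leds * 3)
    temp = [0, 0, 0]
    for x in range(num_leds):
        for ch in range(3):
            temp[c_order[ch]] = gamma[data[x * 3 + ch]]
        buf[x * 3:x * 3 + 3] = temp
    return buf

def remap_numpy(data, gamma, c_order):
    """Vectorised version that writes channel i to position c_order[i]."""
    pixels = np.asarray(data).reshape(-1, 3)   # one row per LED: [r, g, b]
    corrected = np.asarray(gamma)[pixels]      # gamma lookup via fancy indexing
    out = np.empty_like(corrected)
    out[:, c_order] = corrected                # out[:, c_order[i]] = corrected[:, i]
    return out.ravel()

rng = np.random.RandomState(0)
gamma = [i // 2 for i in range(256)]           # placeholder gamma table
data = list(rng.randint(0, 256, 30))
c_order = [1, 2, 0]
# The vectorised result matches the loop for a non-trivial channel order.
assert remap_numpy(data, gamma, c_order).tolist() == remap_loop(data, gamma, c_order)
```

If the gather form is preferred, passing the inverse permutation (for example np.argsort(c_order)) to np.take should give the same result as the scatter above.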

So you need pure Python code without any extension library.

To speed up the code:

  1. Use local variables in loops.
  2. Change the for loop to a list comprehension.

Here is the code:

class Test(object):

    def __init__(self, n):
        self.numLEDs =  n
        self.c_order = [1, 2, 0]
        self.gamma = [i // 2 for i in range(256)]

    def do1(self, data):
        buf = [0 for i in range(self.numLEDs * 3)]
        temp = [0,0,0]
        for x in range(self.numLEDs):
            r = data[x*3]
            g = data[x*3+1]
            b = data[x*3+2]
            temp[self.c_order[0]] = self.gamma[r]
            temp[self.c_order[1]] = self.gamma[g]
            temp[self.c_order[2]] = self.gamma[b]

            buf[x * 3:x * 3 + 3] = temp
        return buf

    def do2(self, data):
        buf = [0] * (self.numLEDs * 3)
        gamma = self.gamma
        for idx, idx2 in enumerate(self.c_order):
            buf[idx2::3] = [gamma[v] for v in data[idx::3]]
        return buf

import random
random.seed(0)
N = 1000
t = Test(N)
data = [random.randint(0, 255) for i in range(3*N)]
r1 = t.do1(data)
r2 = t.do2(data)
print(r1 == r2)  # check the result (this form works on Python 2 and 3)

%timeit t.do1(data)
%timeit t.do2(data)

The output shows that do2 is about 6x faster:

True
1000 loops, best of 3: 1.1 ms per loop
10000 loops, best of 3: 176 µs per loop
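If the pixel data can be held as bytes and every gamma entry fits in 0..255, a further pure-Python option on Python 3 is to let bytes.translate do the table lookup and keep the same extended-slice trick as do2 for the channel swap. This is only a sketch under those assumptions; remap_bytes is a hypothetical helper, not part of the code above, and it has not been timed here.

```python
def remap_bytes(data, table, c_order):
    """Gamma-correct and reorder channels without a per-LED Python loop."""
    corrected = bytes(data).translate(table)   # 256-entry lookup done in C
    buf = bytearray(len(corrected))
    for src, dst in enumerate(c_order):        # only three iterations
        buf[dst::3] = corrected[src::3]        # extended-slice copy per channel
    return buf

gamma = [i // 2 for i in range(256)]           # placeholder table, as in Test.__init__
table = bytes(gamma)                           # assumes every entry is in 0..255
data = bytes([10, 20, 30, 40, 50, 60])         # two LEDs, r/g/b each
result = remap_bytes(data, table, [1, 2, 0])
print(list(result))                            # prints [15, 5, 10, 30, 20, 25]
```

The result is a bytearray rather than a list, which may or may not suit whatever consumes buf.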

Contrary to popular belief, calling map will not give you a significant speedup. You may actually see worse performance.
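A quick way to check this on your own interpreter is to time the gamma lookup written three ways. The sketch below uses the standard timeit module; the function names and the data size are arbitrary choices for illustration, and the numbers will vary by machine and Python version, so no timings are claimed here. map with a lambda pays for a Python-level function call per element, which is one reason it can come out slower than a list comprehension.

```python
import random
import timeit

random.seed(0)
gamma = [i // 2 for i in range(256)]                  # placeholder gamma table
data = [random.randint(0, 255) for _ in range(3000)]  # 1000 LEDs worth of bytes

def with_comprehension():
    return [gamma[v] for v in data]

def with_map_lambda():
    # Each element triggers a Python-level call to the lambda.
    return list(map(lambda v: gamma[v], data))

def with_map_builtin():
    # Passing the bound method directly avoids the lambda call.
    return list(map(gamma.__getitem__, data))

# All three variants must produce the same list.
assert with_comprehension() == with_map_lambda() == with_map_builtin()

for fn in (with_comprehension, with_map_lambda, with_map_builtin):
    seconds = timeit.timeit(fn, number=200)
    print("%-20s %.3f ms per call" % (fn.__name__, seconds / 200 * 1e3))
```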

Depending on how much time you spend in this section of code, this may be a perfect case for simply porting the loop to C. See here.

Make sure that you're actually spending a lot of time in this for loop; otherwise the overhead of calling your C code will outweigh any potential performance gains.

Read here for some potential alternatives if you decide to port this code to C:

  1. ctypes vs C extension
  2. Wrapping a C library in Python: C, Cython or ctypes?
