for loop to list comprehension or map in Python
I'm trying to speed up some Python code a bit, so I'm trying to convert a standard for loop to either a list comprehension or a map call:
    buf = [0 for i in range(self.numLEDs * 3)]
    temp = [0,0,0]
    for x in range(self.numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[self.c_order[0]] = self.gamma[r]
        temp[self.c_order[1]] = self.gamma[g]
        temp[self.c_order[2]] = self.gamma[b]
        buf[x * 3:x * 3 + 3] = temp
c_order is simply another list, in this case [1,2,0]. It controls the channel order for some RGB pixels. gamma is a list 256 elements long that holds the gamma-corrected value for each of the 8-bit channel values.
What I'd like to do is completely remove the standard for loop from this bit of code. I've managed to do it with the gamma correction but without the channel swap, and it's twice as fast. Like this:
    corrected = [gamma[i] for i in data]
    buf[0:len(corrected)] = corrected
How can I swap the order of list elements as I go without a for loop, though?
You can have everything done in numpy in a few lines, and slightly faster:
    In [69]:
    import numpy as np
    gamma=list(np.random.rand(256))
    numLEDs=10
    data=list(np.random.randint(0,256,30))
    c_order=[0,1,2]
    In [70]:
    %%timeit
    buf = [0 for i in range(numLEDs * 3)]
    temp = [0,0,0]
    for x in range(numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[c_order[0]] = gamma[r]
        temp[c_order[1]] = gamma[g]
        temp[c_order[2]] = gamma[b]
        buf[x * 3:x * 3 + 3] = temp
    10000 loops, best of 3: 47.3 µs per loop
    In [85]:
    gamma=np.array(gamma)
    data=np.array(data)
    In [86]:
    %%timeit
    data_array=data.reshape(3, -1, order='F')
    np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
    10000 loops, best of 3: 38.3 µs per loop
When you have a lot of LEDs, the numpy version will be much faster than the loop version:
    In [98]:
    gamma=list(np.random.rand(256))
    numLEDs=1000
    data=list(np.random.randint(0,256,3000))
    c_order=[0,1,2]
    In [99]:
    %%timeit
    buf = [0 for i in range(numLEDs * 3)]
    temp = [0,0,0]
    for x in range(numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[c_order[0]] = gamma[r]
        temp[c_order[1]] = gamma[g]
        temp[c_order[2]] = gamma[b]
        buf[x * 3:x * 3 + 3] = temp
    100 loops, best of 3: 4.08 ms per loop
    In [100]:
    gamma=np.array(gamma)
    data=np.array(data)
    In [101]:
    %%timeit
    data_array=data.reshape(3, -1, order='F')
    np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
    1000 loops, best of 3: 244 µs per loop
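One caveat about the numpy expression: np.take(..., c_order, axis=0) gathers (output channel j is read from input channel c_order[j]), whereas the loop scatters (input channel k is written to position c_order[k]). The two agree for the c_order=[0,1,2] used in the timings above, but not for the [1,2,0] from the question. Below is a minimal sketch of a scatter variant, assuming c_order is a permutation of 0, 1, 2; the function names are illustrative and not part of the original answer.

```python
import numpy as np


def loop_version(data, gamma, c_order, num_leds):
    """Reference implementation: the per-pixel loop from the question."""
    buf = [0] * (num_leds * 3)
    temp = [0, 0, 0]
    for x in range(num_leds):
        for k in range(3):
            temp[c_order[k]] = gamma[data[x * 3 + k]]
        buf[x * 3:x * 3 + 3] = temp
    return buf


def numpy_scatter(data, gamma, c_order):
    """Gamma-correct and reorder channels with the same semantics as the loop."""
    corrected = gamma[data].reshape(-1, 3)  # one row per pixel: (r, g, b)
    out = np.empty_like(corrected)
    out[:, c_order] = corrected             # input channel k goes to column c_order[k]
    return out.ravel()


rng = np.random.RandomState(0)
gamma = rng.rand(256)
data = rng.randint(0, 256, 30)
c_order = [1, 2, 0]

expected = loop_version(list(data), list(gamma), c_order, 10)
result = numpy_scatter(data, gamma, c_order)
matches = np.allclose(result, expected)
print(matches)
```

The comparison against the loop is the point of the sketch: whichever vectorised form is used, it is worth checking it against the original loop with a non-identity c_order.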
So you need pure Python code without any extension library.
To speed up the code, do2 below keeps self.gamma in a local variable and replaces the per-pixel loop with one strided slice assignment per channel.
Here is the code:
    class Test(object):
        def __init__(self, n):
            self.numLEDs = n
            self.c_order = [1, 2, 0]
            self.gamma = [i // 2 for i in range(256)]
        def do1(self, data):
            # original per-pixel loop
            buf = [0 for i in range(self.numLEDs * 3)]
            temp = [0,0,0]
            for x in range(self.numLEDs):
                r = data[x*3]
                g = data[x*3+1]
                b = data[x*3+2]
                temp[self.c_order[0]] = self.gamma[r]
                temp[self.c_order[1]] = self.gamma[g]
                temp[self.c_order[2]] = self.gamma[b]
                buf[x * 3:x * 3 + 3] = temp
            return buf
        def do2(self, data):
            buf = [0] * (self.numLEDs * 3)
            gamma = self.gamma  # local name avoids repeated attribute lookups
            for idx, idx2 in enumerate(self.c_order):
                # source channel idx of every pixel goes to output channel idx2
                buf[idx2::3] = [gamma[v] for v in data[idx::3]]
            return buf
    import random
    random.seed(0)
    N = 1000
    t = Test(N)
    data = [random.randint(0, 255) for i in range(3*N)]
    r1 = t.do1(data)
    r2 = t.do2(data)
    print(r1 == r2) # check the result
    %timeit t.do1(data)
    %timeit t.do2(data)
The output shows that do2 is about 6x faster:
    True
    1000 loops, best of 3: 1.1 ms per loop
    10000 loops, best of 3: 176 µs per loop
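%timeit is an IPython magic. Outside IPython, the standard timeit module can run the same comparison. The following is a sketch only: the two standalone functions mirror do1 and do2, their names are illustrative, and no timings are shown because they depend on the machine and Python version.

```python
import random
import timeit


def reorder_loop(data, gamma, c_order, num_leds):
    """Per-pixel loop, equivalent to do1."""
    buf = [0] * (num_leds * 3)
    temp = [0, 0, 0]
    for x in range(num_leds):
        temp[c_order[0]] = gamma[data[x * 3]]
        temp[c_order[1]] = gamma[data[x * 3 + 1]]
        temp[c_order[2]] = gamma[data[x * 3 + 2]]
        buf[x * 3:x * 3 + 3] = temp
    return buf


def reorder_slices(data, gamma, c_order, num_leds):
    """One strided slice assignment per channel, equivalent to do2."""
    buf = [0] * (num_leds * 3)
    for src, dst in enumerate(c_order):
        buf[dst::3] = [gamma[v] for v in data[src::3]]
    return buf


random.seed(0)
N = 1000
gamma = [i // 2 for i in range(256)]
c_order = [1, 2, 0]
data = [random.randint(0, 255) for _ in range(3 * N)]

# Check that both functions produce the same buffer before timing them.
same = reorder_loop(data, gamma, c_order, N) == reorder_slices(data, gamma, c_order, N)

calls = 100
t_loop = min(timeit.repeat(lambda: reorder_loop(data, gamma, c_order, N),
                           number=calls, repeat=3)) / calls
t_slices = min(timeit.repeat(lambda: reorder_slices(data, gamma, c_order, N),
                             number=calls, repeat=3)) / calls

print("same result:", same)
print("loop:   %.1f us per call" % (t_loop * 1e6))
print("slices: %.1f us per call" % (t_slices * 1e6))
```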
Contrary to popular belief, calling a map function will not give you a significant speedup. You may actually see worse performance.
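This is easy to measure for the gamma lookup on its own. The sketch below compares a list comprehension with two map variants; using gamma.__getitem__ and a lambda as the mapped functions is my choice for illustration, since the answer does not say which form it has in mind. A lambda adds a Python-level function call per element, which is a common reason map ends up no faster.

```python
import random
import timeit

random.seed(0)
gamma = [i // 2 for i in range(256)]
data = [random.randint(0, 255) for _ in range(3000)]


def with_comprehension():
    return [gamma[v] for v in data]


def with_map_builtin():
    # list() forces evaluation on Python 3, where map is lazy
    return list(map(gamma.__getitem__, data))


def with_map_lambda():
    return list(map(lambda v: gamma[v], data))


# All three must agree before their timings mean anything.
results_equal = with_comprehension() == with_map_builtin() == with_map_lambda()

calls = 200
for fn in (with_comprehension, with_map_builtin, with_map_lambda):
    best = min(timeit.repeat(fn, number=calls, repeat=3)) / calls
    print("%-20s %.1f us per call" % (fn.__name__, best * 1e6))
```

Timings are deliberately not listed here; run it on the target interpreter, since the ranking can differ between Python versions.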
Depending on how long you spend in this section of code, this may be the perfect situation where simply porting this loop to C makes sense. See here.
Make sure that you're actually spending a lot of time in this for loop; otherwise the overhead of calling your C code will outweigh any potential performance gains.
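One way to check where the time goes before porting anything is the standard-library profiler. This is a sketch under assumptions: update_leds and render_frames are hypothetical stand-ins for the real code path, and the frame and LED counts are arbitrary.

```python
import cProfile
import io
import pstats
import random


def update_leds(data, gamma, c_order, num_leds):
    """Hypothetical stand-in for the loop from the question."""
    buf = [0] * (num_leds * 3)
    temp = [0, 0, 0]
    for x in range(num_leds):
        temp[c_order[0]] = gamma[data[x * 3]]
        temp[c_order[1]] = gamma[data[x * 3 + 1]]
        temp[c_order[2]] = gamma[data[x * 3 + 2]]
        buf[x * 3:x * 3 + 3] = temp
    return buf


def render_frames(frames, gamma, c_order, num_leds):
    """Hypothetical caller: processes a batch of frames."""
    return [update_leds(frame, gamma, c_order, num_leds) for frame in frames]


random.seed(0)
num_leds = 300
gamma = [i // 2 for i in range(256)]
c_order = [1, 2, 0]
frames = [[random.randint(0, 255) for _ in range(3 * num_leds)] for _ in range(50)]

profiler = cProfile.Profile()
profiler.enable()
render_frames(frames, gamma, c_order, num_leds)
profiler.disable()

# Collect the report sorted by cumulative time, then print it.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

If the loop function accounts for only a small share of the cumulative time in the real program, porting it to C is unlikely to pay off.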
Read here for some potential alternatives if you decide to port this code to C: