How to make list comprehensions faster?
Is there anything in general I could use to speed up nested list comprehensions (sometimes conditional)?
I was thinking about numpy or transposing (but how?), but there might be something I have overlooked.
Examples:
nextInt = 1
winPoints = [[max(upSlice[:i]) for i in range(nextInt, len(upSlice) + 1)] for upSlice in ppValues]
or
winningRatio = [
    [1 if ratioUp > ratioDown else 0 if ratioDown > ratioUp else
     1 if pointsUp > pointsDown else 0 if pointsDown > pointsUp else 1
     for ratioUp, ratioDown, pointsUp, pointsDown
     in zip(ratioUpSlice, ratioDownSlice, pointsUpSlice, pointsDownSlice)]
    for ratioUpSlice, ratioDownSlice, pointsUpSlice, pointsDownSlice
    in zip(ratios_Up, ratios_Down, pointsUpSlices, pointsDownSlices)
]
(The slices in the nested lists do NOT all have the same length.)
I'm assuming ppValues is a 2-dimensional list of numbers?
Generally, to speed up an algorithm you want to shave off any redundant calculation. In your first piece of code, max(upSlice[:i]) is executed about n*m times and performs roughly (n^2)/2 * m comparisons in total, where n is the length of an inner list and m is the length of the outer list. Each call recomputes the max of the same slice as the previous call, plus one additional element, which is a lot of unnecessary comparisons.
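To put rough numbers on that estimate, here is a small sketch that counts the work for a single inner list of length n with nextInt = 1. The two helper names are made up for illustration and do not come from the question; the second one assumes a single pass that only remembers the largest value seen so far.

```python
def elements_scanned_naive(n):
    # max(upSlice[:i]) looks at i elements, for i = 1 .. n
    return sum(range(1, n + 1))  # equals n * (n + 1) / 2


def comparisons_single_pass(n):
    # one comparison per element after the first
    return max(n - 1, 0)


n = 1000
print(elements_scanned_naive(n))   # 500500
print(comparisons_single_pass(n))  # 999
```

Each slice upSlice[:i] also copies i elements before max even runs, so the real gap is somewhat larger than the comparison count alone suggests.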
To speed it up, we can use dynamic programming: build a slice_max array such that the element at index i is the max of the first i + 1 elements, max(upSlice[:i+1]). We can do this efficiently by reusing past computations, since the element at index i is the max of just two values, max(slice_max[i-1], upSlice[i]), which is a single comparison.
Here's a simple implementation of the dynamic programming version:
def get_slice_max(arr, start):
    # first entry: max of the first `start` elements
    result = [max(arr[:start])]
    # each later entry reuses the previous max, so it costs one comparison
    for i in range(start, len(arr)):
        result.append(max(result[-1], arr[i]))
    return result

winPoints = [get_slice_max(upSlice, nextInt) for upSlice in ppValues]
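The same running maximum can also be written with itertools.accumulate from the standard library. This is a minimal sketch rather than part of the original answer; it assumes nextInt >= 1, and the function name and the small input are made up for illustration.

```python
from itertools import accumulate


def get_slice_max_accumulate(arr, start):
    # accumulate(arr, max) yields max(arr[:1]), max(arr[:2]), ..., max(arr[:len(arr)])
    prefix_max = list(accumulate(arr, max))
    # max(arr[:i]) sits at position i - 1, so drop the first start - 1 entries
    return prefix_max[start - 1:]


# small hypothetical input, not the data from the question
ppValues = [[3, 1, 4, 1, 5], [2, 7, 1], [9]]
nextInt = 1
winPoints = [get_slice_max_accumulate(upSlice, nextInt) for upSlice in ppValues]
print(winPoints)  # [[3, 3, 4, 4, 5], [2, 7, 7], [9]]
```

Since the question mentions numpy: if the inner lists are NumPy arrays, np.maximum.accumulate(upSlice)[nextInt - 1:] computes the same prefix maxima in compiled code. Whether it beats the plain loop on short lists would need measuring.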
Here's a comparison:
Generating Data
import numpy as np

# generate fake data
np.random.seed(42)
# variable lengths of the inner lists, each between 1 and 999
inner_sizes = np.random.randint(low=1, high=1000, size=10000)
ppValues = [np.random.randint(1000, size=i) for i in inner_sizes]
# ppValues contains 10000 arrays, each having 1 to 999 elements; each element is an integer from 0 to 999
Your version
nextInt=1
winPoints = [[max(upSlice[:i]) for i in range(nextInt,len(upSlice)+1)] for upSlice in ppValues]
------------------
Wall time: 2min 36s
Simple Dynamic Programming Version
def get_slice_max(arr, start):
    result = [max(arr[:start])]
    for i in range(start, len(arr)):
        result.append(max(result[-1], arr[i]))
    return result

winPoints = [get_slice_max(upSlice, nextInt) for upSlice in ppValues]
-----------------
Wall time: 1.79 s
You can see a major improvement, from about 2.5 minutes to under 2 seconds.
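A timing comparison only means something if both versions return the same result, so a quick equivalence check is worth running first. The sketch below uses small made-up data from the standard random module rather than the benchmark data, and the name naive is just a label for the original comprehension. One edge case shows up: when an inner list is shorter than start, the comprehension returns an empty list while get_slice_max returns a single element, so the check skips those lists.

```python
import random


def naive(arr, start):
    # the comprehension from the question, for one inner list
    return [max(arr[:i]) for i in range(start, len(arr) + 1)]


def get_slice_max(arr, start):
    result = [max(arr[:start])]
    for i in range(start, len(arr)):
        result.append(max(result[-1], arr[i]))
    return result


random.seed(0)
# hypothetical test data: 200 lists of 1 to 50 integers each
sample = [[random.randrange(1000) for _ in range(random.randint(1, 50))]
          for _ in range(200)]

for start in (1, 2, 3):
    for arr in sample:
        if len(arr) >= start:  # the two versions differ on shorter lists
            assert naive(arr, start) == get_slice_max(arr, start)
print("both versions agree")
```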
As for the second example, you did not provide enough context to try to tackle that problem.
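Going only by the expression as posted, one cautious observation is still possible. The chained conditional gives 1 when the "up" ratio is larger, 0 when it is smaller, and on a ratio tie falls back to the same test on points, with a full tie counting as 1. Assuming all values are ordinary numbers (no NaN), that is the same as a lexicographic tuple comparison. The function name and sample data below are made up for illustration.

```python
def winning_ratio_row(ratio_up, ratio_down, points_up, points_down):
    # (ratio, points) tuples compare lexicographically: the ratio decides,
    # points break ratio ties, and a full tie counts as 1 because of >=
    return [int((r_up, p_up) >= (r_down, p_down))
            for r_up, r_down, p_up, p_down
            in zip(ratio_up, ratio_down, points_up, points_down)]


# hypothetical data with inner lists of different lengths
ratios_Up = [[0.5, 0.2, 0.3, 0.3], [0.1]]
ratios_Down = [[0.4, 0.6, 0.3, 0.3], [0.1]]
pointsUpSlices = [[1, 1, 5, 2], [7]]
pointsDownSlices = [[1, 1, 4, 3], [7]]

winningRatio = [winning_ratio_row(r_up, r_down, p_up, p_down)
                for r_up, r_down, p_up, p_down
                in zip(ratios_Up, ratios_Down, pointsUpSlices, pointsDownSlices)]
print(winningRatio)  # [[1, 0, 1, 0], [1]]
```

This mainly shortens the expression. Unlike the first example there is no repeated work to remove here, so any speed difference would have to be measured on the real data.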