numpy: split 1D array of chunks separated by nans into a list of the chunks
I have a numpy array in which only some values are valid and the rest are nan. Example:
[nan,nan, 1 , 2 , 3 , nan, nan, 10, 11 , nan, nan, nan, 23, 1, nan, 7, 8]
I would like to split it into a list of chunks, each containing one run of consecutive valid values. The result would be:
[[1,2,3], [10,11], [23,1], [7,8]]
I managed to get it done by iterating over the array, checking isfinite() and producing (start, stop) indexes.
However, it is painfully slow.
Do you perhaps have a better idea?
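The (start, stop) index idea described above does not have to be a Python loop. Here is a minimal sketch of a vectorized version, assuming the input is a 1D array of floats; the helper name split_on_nan is made up for illustration and is not from the thread.

```python
import numpy as np

def split_on_nan(a):
    """Split a 1D array into runs of consecutive finite values (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    valid = np.isfinite(a)
    # Pad with False on both sides so every run has a detectable start and stop.
    padded = np.concatenate(([False], valid, [False]))
    # As integers, the difference is +1 at a run start and -1 just past a run end.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [a[i:j] for i, j in zip(starts, stops)]

nan = np.nan
x = [nan, nan, 1, 2, 3, nan, nan, 10, 11, nan, nan, nan, 23, 1, nan, 7, 8]
chunks = split_on_nan(x)
```

Like the isfinite() check in the question, this treats inf as invalid too. Swap in ~np.isnan(a) if only nan should act as a separator.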
Here is another possibility:
import numpy as np
nan = np.nan
def using_clump(a):
    return [a[s] for s in np.ma.clump_unmasked(np.ma.masked_invalid(a))]
x = [nan,nan, 1 , 2 , 3 , nan, nan, 10, 11 , nan, nan, nan, 23, 1, nan, 7, 8]
In [56]: using_clump(x)
Out[56]:
[array([ 1., 2., 3.]),
array([ 10., 11.]),
array([ 23., 1.]),
array([ 7., 8.])]
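To see why this works, it may help to look at the intermediate values. The sketch below (variable names are mine) shows that np.ma.masked_invalid masks the nan entries and np.ma.clump_unmasked returns one slice per contiguous unmasked run, which is then used to index the original array.

```python
import numpy as np

nan = np.nan
a = np.array([nan, nan, 1, 2, 3, nan, nan, 10, 11, nan, nan, nan, 23, 1, nan, 7, 8])

# masked_invalid masks every nan/inf entry of the array.
masked = np.ma.masked_invalid(a)
# clump_unmasked returns a list of slices, one per contiguous run of unmasked entries.
slices = np.ma.clump_unmasked(masked)
# Collect the (start, stop) bounds of each run as plain ints for inspection.
bounds = [(int(s.start), int(s.stop)) for s in slices]
# Indexing the original array with each slice yields the chunks.
chunks = [a[s] for s in slices]
print(bounds)
```

The slices are effectively the (start, stop) pairs the question was building by hand, computed inside numpy instead of in a Python loop.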
Some benchmarks comparing using_clump and using_groupby:
import itertools as IT
groupby = IT.groupby
def using_groupby(a):
    return [list(v) for k, v in groupby(a, np.isfinite) if k]
In [58]: %timeit using_clump(x)
10000 loops, best of 3: 37.3 us per loop
In [59]: %timeit using_groupby(x)
10000 loops, best of 3: 53.1 us per loop
The advantage of using_clump grows for larger arrays:
In [9]: x = x*1000
In [12]: %timeit using_clump(x)
100 loops, best of 3: 5.69 ms per loop
In [13]: %timeit using_groupby(x)
10 loops, best of 3: 60 ms per loop
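The timings above come from IPython's %timeit. For anyone wanting to repeat the comparison in a plain script, here is a rough sketch using the standard timeit module; the repeat counts are arbitrary choices of mine, and the numbers will differ by machine and numpy version.

```python
import timeit
from itertools import groupby

import numpy as np

nan = np.nan

def using_clump(a):
    return [a[s] for s in np.ma.clump_unmasked(np.ma.masked_invalid(a))]

def using_groupby(a):
    return [list(v) for k, v in groupby(a, np.isfinite) if k]

# Same input as the larger benchmark: the 17-element list repeated 1000 times.
x = [nan, nan, 1, 2, 3, nan, nan, 10, 11, nan, nan, nan, 23, 1, nan, 7, 8] * 1000

for func in (using_clump, using_groupby):
    # Best of 3 repeats of 5 calls each; report the average time per call.
    best = min(timeit.repeat(lambda: func(x), number=5, repeat=3))
    print(f"{func.__name__}: {best / 5 * 1e3:.2f} ms per call")
```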
I'd use itertools.groupby. It might be slightly faster:
import numpy as np
from itertools import groupby

nan = np.nan
a = np.array([nan, nan, 1, 2, 3, nan, nan, 10, 11, nan, nan, nan, 23, 1, nan, 7, 8])
# Group consecutive elements by finiteness and keep only the finite runs.
result = [list(v) for k, v in groupby(a, np.isfinite) if k]
print(result)  # four chunks of floats: 1,2,3 / 10,11 / 23,1 / 7,8