
Extract subarrays of numpy array whose values are above a threshold

I have a sound signal, imported as a numpy array, and I want to cut it into chunks of numpy arrays. However, I want the chunks to contain only elements above a threshold. For example:

threshold = 3
signal = [1,2,6,7,8,1,1,2,5,6,7]

should output two arrays:

vec1 = [6,7,8]
vec2 = [5,6,7]

OK, the above are lists, but you get my point.

Here is what I tried so far, but this just kills my RAM:

def slice_raw_audio(audio_signal, threshold=5000):

    signal_slice, chunks = [], []

    for idx in range(0, audio_signal.shape[0], 1000):
        while audio_signal[idx] > threshold:
            signal_slice.append(audio_signal[idx])
        chunks.append(signal_slice)
    return chunks

Here's one approach:

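import numpy as np

def split_above_threshold(signal, threshold):
    # Pad with False so runs touching either end still have two edges
    mask = np.concatenate(([False], signal > threshold, [False]))
    # Indices where the mask flips: even entries start a run, odd entries end it
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]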

Sample run:

In [48]: threshold = 3
    ...: signal = np.array([1,1,7,1,2,6,7,8,1,1,2,5,6,7,2,8,7,2])
    ...: 

In [49]: split_above_threshold(signal, threshold)
Out[49]: [array([7]), array([6, 7, 8]), array([5, 6, 7]), array([8, 7])]
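To see why this works, it may help to look at the intermediate arrays for the small signal from the question. The sketch below only retraces the function step by step; the names `starts` and `stops` are introduced here for illustration and are not part of the answer.

```python
import numpy as np

threshold = 3
signal = np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7])

# Pad the boolean mask with False on both sides so that runs touching
# either end of the signal still produce a rising and a falling edge.
mask = np.concatenate(([False], signal > threshold, [False]))

# Positions where the mask changes value; rising and falling edges alternate.
idx = np.flatnonzero(mask[1:] != mask[:-1])

# Illustrative names: even entries are run starts, odd entries are run stops.
starts, stops = idx[::2], idx[1::2]
chunks = [signal[a:b] for a, b in zip(starts, stops)]

print(idx.tolist())  # [2, 5, 8, 11]
print(chunks)        # [array([6, 7, 8]), array([5, 6, 7])]
```

Because the padding guarantees an even number of edges, each start always has a matching stop, and the slices are views into `signal` rather than copies.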

Runtime test

Other approaches:

# @Psidom's soln
def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)

# @Kasramvd's soln   
def split_diff_step(signal, threshold):   
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]

Timings:

In [67]: signal = np.random.randint(0,9,(100000))

In [68]: threshold = 3

# @Kasramvd's soln 
In [69]: %timeit split_diff_step(signal, threshold)
10 loops, best of 3: 39.8 ms per loop

# @Psidom's soln
In [70]: %timeit arange_diff(signal, threshold)
10 loops, best of 3: 20.5 ms per loop

In [71]: %timeit split_above_threshold(signal, threshold)
100 loops, best of 3: 8.22 ms per loop
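The numbers above depend on machine and NumPy version, so they are best re-measured locally. A minimal sketch of how that might be done outside IPython, using the standard-library `timeit` module, is shown below; the seed and repeat count are arbitrary choices. It also checks that the two functions agree on the random input. `split_diff_step` is left out of the comparison because its `[1::2]` indexing assumes the signal starts at or below the threshold, which a random signal need not do.

```python
import timeit
import numpy as np

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False]))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0] + 1)

rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
signal = rng.integers(0, 9, 100_000)  # integers in [0, 9), as in the timings
threshold = 3

# Both functions should return the same runs of above-threshold values.
a = split_above_threshold(signal, threshold)
b = arange_diff(signal, threshold)
same = len(a) == len(b) and all(np.array_equal(x, y) for x, y in zip(a, b))
print("outputs agree:", same)

# Average wall-clock time per call over a few repetitions.
for fn in (split_above_threshold, arange_diff):
    seconds = timeit.timeit(lambda: fn(signal, threshold), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```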

Here is a Numpythonic approach:

In [115]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)
Out[115]: [array([1, 2]), array([6, 7, 8]), array([1, 1, 2]), array([5, 6, 7])]

Note that this returns both the lower (at or below threshold) and the upper (above threshold) chunks. Because the split points are exactly the positions where `signal > threshold` changes value (found with `np.diff`), the two kinds of chunks always alternate, which means you can separate them simply by indexing with a step of 2:

In [121]: signal = np.array([1,2,6,7,8,1,1,2,5,6,7])

In [122]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[::2]
Out[122]: [array([1, 2]), array([1, 1, 2])]

In [123]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
Out[123]: [array([6, 7, 8]), array([5, 6, 7])]

You can compare the first item of the array with the threshold to find out which of the two slices above gives you the upper items.

In general, you can use the following snippet to get the upper items:

np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[int(signal[0] <= threshold)::2]
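The start-index logic can be sketched as a small function. The name `upper_chunks` is hypothetical, and the sketch assumes a non-empty 1-D array. It reads the starting offset from the boolean mask itself, so a first element exactly equal to the threshold is treated as not above it, consistent with the `signal > threshold` comparison used for splitting.

```python
import numpy as np

def upper_chunks(signal, threshold):
    """Hypothetical helper: runs of values strictly above `threshold`.

    Assumes `signal` is a non-empty 1-D NumPy array.
    """
    above = signal > threshold
    # Split wherever the above/below state flips between neighbours.
    pieces = np.split(signal, np.flatnonzero(np.diff(above)) + 1)
    # Pieces alternate; the first one is an "above" piece only if
    # the first element is itself above the threshold.
    first = 0 if above[0] else 1
    return pieces[first::2]

threshold = 3
print(upper_chunks(np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7]), threshold))
# [array([6, 7, 8]), array([5, 6, 7])]
print(upper_chunks(np.array([7, 1, 2, 6, 7, 8]), threshold))
# [array([7]), array([6, 7, 8])]
```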

Here is one option:

above_th = signal > threshold
index, values = np.arange(signal.size)[above_th], signal[above_th]
np.split(values, np.where(np.diff(index) > 1)[0]+1)
# [array([6, 7, 8]), array([5, 6, 7])]

Wrap it in a function:

def above_thresholds(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)

above_thresholds(signal, threshold)
# [array([6, 7, 8]), array([5, 6, 7])]
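One edge case may be worth checking: when no element exceeds the threshold, `np.split` is handed an empty array with no split points and returns a list holding a single empty array rather than an empty list. A possible guard is sketched below; the name `above_thresholds_nonempty` is hypothetical, not part of the answer.

```python
import numpy as np

def above_thresholds_nonempty(signal, threshold):
    """Hypothetical variant of above_thresholds that drops empty chunks."""
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    chunks = np.split(values, np.where(np.diff(index) > 1)[0] + 1)
    # With nothing above the threshold, chunks is [array([])]; filter it out.
    return [c for c in chunks if c.size > 0]

quiet = np.array([1, 2, 1, 0])
print(above_thresholds_nonempty(quiet, 3))  # []
```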
