Extract subarrays of numpy array whose values are above a threshold
I have a sound signal, imported as a numpy array, and I want to cut it into chunks of numpy arrays. However, I want the chunks to contain only elements above a threshold. For example:
threshold = 3
signal = [1,2,6,7,8,1,1,2,5,6,7]
should output two arrays:
vec1 = [6,7,8]
vec2 = [5,6,7]
OK, the above are lists, but you get my point.
Here is what I tried so far, but this just kills my RAM:
def slice_raw_audio(audio_signal, threshold=5000):
    signal_slice, chunks = [], []
    for idx in range(0, audio_signal.shape[0], 1000):
        while audio_signal[idx] > threshold:
            signal_slice.append(audio_signal[idx])
        chunks.append(signal_slice)
    return chunks
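Here's one approach -
import numpy as np

def split_above_threshold(signal, threshold):
    # Pad with False so runs touching either end still get a start and a stop
    mask = np.concatenate(([False], signal > threshold, [False]))
    # Positions where the mask flips: starts and stops alternate
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]
Sample run -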
In [48]: threshold = 3
...: signal = np.array([1,1,7,1,2,6,7,8,1,1,2,5,6,7,2,8,7,2])
...:
In [49]: split_above_threshold(signal, threshold)
Out[49]: [array([7]), array([6, 7, 8]), array([5, 6, 7]), array([8, 7])]
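To see why the padding trick works, it may help to print the intermediate arrays for the example from the question. This is only an illustrative trace: mask and idx mirror the names in the function above, while starts, stops and chunks are names assumed for this demo.

```python
import numpy as np

threshold = 3
signal = np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7])

# Pad the boolean mask with False on both sides so that every run of
# True values has a well-defined start and end, even at the array edges.
mask = np.concatenate(([False], signal > threshold, [False]))

# Positions where the mask flips, expressed in signal coordinates:
# even entries are run starts, odd entries are run stops (exclusive).
idx = np.flatnonzero(mask[1:] != mask[:-1])
print(idx)  # [ 2  5  8 11]

starts, stops = idx[::2], idx[1::2]
chunks = [signal[a:b] for a, b in zip(starts, stops)]
print(chunks)  # [array([6, 7, 8]), array([5, 6, 7])]
```

Because the padded mask begins and ends with False, idx always has an even number of entries, so the starts and stops pair up cleanly.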
Other approaches -
# @Psidom's soln
def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)

# @Kasramvd's soln (the [1::2] assumes the first item is not above the threshold)
def split_diff_step(signal, threshold):
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
Timings -
In [67]: signal = np.random.randint(0,9,(100000))
In [68]: threshold = 3
# @Kasramvd's soln
In [69]: %timeit split_diff_step(signal, threshold)
10 loops, best of 3: 39.8 ms per loop
# @Psidom's soln
In [70]: %timeit arange_diff(signal, threshold)
10 loops, best of 3: 20.5 ms per loop
In [71]: %timeit split_above_threshold(signal, threshold)
100 loops, best of 3: 8.22 ms per loop
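The absolute numbers above depend on the machine and the NumPy version. Below is a minimal sketch for re-running the comparison outside IPython with the standard timeit module, while also checking that the three functions return the same chunks. The fixed seed, the repeat count, and forcing the first item below the threshold (so that the [1::2] slice is valid) are assumptions of this sketch, not part of the original benchmark.

```python
import timeit
import numpy as np

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False]))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0] + 1)

def split_diff_step(signal, threshold):
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]

rng = np.random.default_rng(0)   # fixed seed: an assumption of this sketch
signal = rng.integers(0, 9, 100000)
signal[0] = 0                    # [1::2] needs a first item that is not above the threshold
threshold = 3

reference = split_above_threshold(signal, threshold)
results = {}
for func in (split_diff_step, arange_diff, split_above_threshold):
    out = func(signal, threshold)
    # Compare chunk by chunk against the reference implementation.
    same = len(out) == len(reference) and all(
        np.array_equal(a, b) for a, b in zip(out, reference))
    seconds = timeit.timeit(lambda: func(signal, threshold), number=5) / 5
    results[func.__name__] = (same, seconds)
    print(f"{func.__name__}: same output={same}, {seconds * 1e3:.2f} ms per call")
```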
Here is a Numpythonic approach:
In [115]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)
Out[115]: [array([1, 2]), array([6, 7, 8]), array([1, 1, 2]), array([5, 6, 7])]
Note that this returns both the lower (not above the threshold) and the upper (above the threshold) chunks. Because the split points come from diff, which marks every place where consecutive items switch sides of the threshold, the two kinds of chunks always alternate, which means that you can separate them simply by indexing:
In [121]: signal = np.array([1,2,6,7,8,1,1,2,5,6,7])
In [122]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[::2]
Out[122]: [array([1, 2]), array([1, 1, 2])]
In [123]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
Out[123]: [array([6, 7, 8]), array([5, 6, 7])]
You can compare the first item of your array with threshold to find out which of the above slices gives you the upper items.
Generally you can use the following snippet to get the upper items; the start index is 0 when the first item is above the threshold and 1 otherwise:
np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[int(signal[0] <= threshold)::2]
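The one-liner above can be wrapped in a small function that also handles an empty input. This is a sketch rather than the answer's own code: the name split_upper and the empty-input guard are assumptions added here.

```python
import numpy as np

def split_upper(signal, threshold):
    """Return the runs of consecutive items strictly above threshold."""
    signal = np.asarray(signal)
    if signal.size == 0:
        return []
    above = signal > threshold
    # Split wherever neighbouring items fall on different sides of the threshold.
    pieces = np.split(signal, np.flatnonzero(np.diff(above)) + 1)
    # Pieces alternate; the above-threshold ones start at index 0 if the
    # first item is above the threshold, otherwise at index 1.
    start = 0 if above[0] else 1
    return pieces[start::2]

print(split_upper([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7], 3))
# [array([6, 7, 8]), array([5, 6, 7])]
```

Choosing the start index from the mask itself keeps the function consistent with the strict comparison, so an array whose first item equals the threshold is handled the same way as one that starts below it.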
Here is one option:
above_th = signal > threshold
index, values = np.arange(signal.size)[above_th], signal[above_th]
np.split(values, np.where(np.diff(index) > 1)[0]+1)
# [array([6, 7, 8]), array([5, 6, 7])]
Wrapped in a function:
def above_thresholds(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)
above_thresholds(signal, threshold)
# [array([6, 7, 8]), array([5, 6, 7])]
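One edge case worth noting: when no item exceeds the threshold, np.split receives an empty array with no split points and returns a list holding that single empty array rather than an empty list. A hedged variant that returns [] in that case could look like this; above_thresholds_safe is a name made up for this sketch.

```python
import numpy as np

def above_thresholds_safe(signal, threshold):
    above_th = signal > threshold
    if not above_th.any():
        # Nothing above the threshold: return no chunks at all.
        return []
    index, values = np.flatnonzero(above_th), signal[above_th]
    # A gap larger than 1 between kept indices marks the start of a new chunk.
    return np.split(values, np.flatnonzero(np.diff(index) > 1) + 1)

print(above_thresholds_safe(np.array([1, 2, 3]), 3))
# []
```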