Count of islands of negative and positive numbers in a NumPy array
I have an array containing chunks of negative and chunks of positive elements. A much simplified example of it would be an array a looking like:
array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
(a<0).sum() and (a>0).sum() give me the total number of negative and positive elements, but how do I count these in order? By this I mean I want to know that my array contains first 3 negative elements, then 6 positive and then 2 negative.
This sounds like a topic that has been addressed somewhere, and there may be a duplicate out there, but I can't find one.
One method is to use numpy.roll(a,1) in a loop over the whole array and count the number of elements of a given sign appearing in, e.g., the first element of the array as it rolls, but that looks neither very NumPy-like (or Pythonic) nor very efficient to me.
Here's one vectorized approach -
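import numpy as np

def pos_neg_counts(a):
    # Assumes a contains no zeros and has at least one sign change
    mask = a>0  # True where the element is positive
    idx = np.flatnonzero(mask[1:] != mask[:-1])  # last index of every run except the final one
    # Lengths of all runs, in order of appearance
    count = np.concatenate(( [idx[0]+1], idx[1:] - idx[:-1], [a.size-1-idx[-1]] ))
    if a[0]<0:
        return count[1::2], count[::2] # pos, neg counts
    else:
        return count[::2], count[1::2] # pos, neg counts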
Sample runs -
In [155]: a
Out[155]: array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
In [156]: pos_neg_counts(a)
Out[156]: (array([6]), array([3, 2]))
In [157]: a[0] = 3
In [158]: a
Out[158]: array([ 3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
In [159]: pos_neg_counts(a)
Out[159]: (array([1, 6]), array([2, 2]))
In [160]: a[-1] = 7
In [161]: a
Out[161]: array([ 3, -2, -1, 1, 2, 3, 4, 5, 6, -5, 7])
In [162]: pos_neg_counts(a)
Out[162]: (array([1, 6, 1]), array([2, 1]))
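The question asks for the run lengths in order of appearance (3, 6, 2), while the function above returns them split by sign. A minimal sketch of an ordered variant follows, using the same sign-change idea. The name run_lengths is hypothetical, and grouping zeros with the negative numbers is an assumption carried over from the a>0 mask.

```python
import numpy as np

def run_lengths(a):
    """Return the lengths of consecutive same-sign runs, in order.

    Assumption: zeros are grouped with the negatives (mask is a > 0).
    """
    a = np.asarray(a)
    if a.size == 0:
        return np.array([], dtype=int)
    mask = a > 0
    # Index at which each new run starts (first run excluded)
    change = np.flatnonzero(mask[1:] != mask[:-1]) + 1
    # Add the array start and end as boundaries, then take differences
    bounds = np.concatenate(([0], change, [a.size]))
    return np.diff(bounds)

a = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
print(run_lengths(a))  # [3 6 2]
```

Because the start and end of the array are added as explicit boundaries, this version also works when the array has a single sign throughout and therefore no sign change at all.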
Runtime test
Other approach(es) -
# @Franz's soln
def split_app(my_array):
    negative_index = my_array<0
    splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
    len_list = [len(i) for i in splits]
    return len_list
Timings on a bigger dataset -
In [20]: # Setup input array
...: reps = np.random.randint(3,10,(100000))
...: signs = np.ones(len(reps),dtype=int)
...: signs[::2] = -1
...: a = np.repeat(signs, reps)*np.random.randint(1,9,reps.sum())
...:
In [21]: %timeit split_app(a)
10 loops, best of 3: 90.4 ms per loop
In [22]: %timeit pos_neg_counts(a)
100 loops, best of 3: 2.21 ms per loop
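For anyone wanting to reproduce the comparison outside IPython, here is a rough sketch using the standard timeit module. The smaller input size and the repetition count are arbitrary assumptions, and absolute timings will differ by machine and NumPy version, so no numbers are claimed here.

```python
import timeit
import numpy as np

def pos_neg_counts(a):
    mask = a > 0
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    count = np.concatenate(([idx[0] + 1], idx[1:] - idx[:-1], [a.size - 1 - idx[-1]]))
    if a[0] < 0:
        return count[1::2], count[::2]  # pos, neg counts
    return count[::2], count[1::2]      # pos, neg counts

def split_app(my_array):
    negative_index = my_array < 0
    splits = np.split(negative_index, np.where(np.diff(negative_index))[0] + 1)
    return [len(i) for i in splits]

# Same construction as the setup above, with fewer runs (assumed size)
reps = np.random.randint(3, 10, 10000)
signs = np.ones(len(reps), dtype=int)
signs[::2] = -1
a = np.repeat(signs, reps) * np.random.randint(1, 9, reps.sum())

# Time a few calls of each function and report the per-call average
for func in (split_app, pos_neg_counts):
    seconds = timeit.timeit(lambda: func(a), number=3) / 3
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```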
Just use
my_array = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
negative_index = my_array<0
and you'll get a boolean mask marking the negative values. After that you can split this array wherever the mask changes:
splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
and then compute the size of the inner arrays:
len_list = [len(i) for i in splits]
print(len_list)
And you'll get what you are looking for:
Out[1]: [3, 6, 2]
You just have to keep track of the sign of the first element, since the runs alternate from there. In my example the first element is negative.
So just execute:
my_array = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
negative_index = my_array<0
splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
len_list = [len(i) for i in splits]
print(len_list)
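Since the lengths alone do not say which sign each run has, one possible extension is to read the sign off the first entry of each chunk of the boolean mask. This is only a sketch; the "neg"/"pos" labels and the name labelled are illustrative choices, not part of the original answer.

```python
import numpy as np

my_array = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
negative_index = my_array < 0
splits = np.split(negative_index, np.where(np.diff(negative_index))[0] + 1)

# Every chunk is constant, so its first entry says whether the run is negative
labelled = [("neg" if chunk[0] else "pos", len(chunk)) for chunk in splits]
print(labelled)  # [('neg', 3), ('pos', 6), ('neg', 2)]
```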
My (rather simple and probably inefficient) solution would be:
import numpy as np
arr = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
sgn = np.sign(arr[0])  # sign of the current run
res = []
cntr = 1 # counting the first one
for i in range(1, len(arr)):
    if np.sign(arr[i]) != sgn:
        # sign flipped: store the finished run and start a new one
        res.append(cntr)
        cntr = 0
        sgn *= -1
    cntr += 1
res.append(cntr)  # store the last run
print(res)
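The same counting can also be written with itertools.groupby from the standard library, which groups consecutive items that share a key. This is a different technique from the explicit loop and is offered only as a compact sketch. Unlike the loop, it would put any zeros into runs of their own.

```python
from itertools import groupby

import numpy as np

arr = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])

# groupby yields one group per stretch of consecutive equal signs;
# counting the items in each group gives the run lengths in order
res = [sum(1 for _ in group) for _, group in groupby(np.sign(arr).tolist())]
print(res)  # [3, 6, 2]
```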