Counting islands of negative and positive numbers in a NumPy array

Question

I have an array containing chunks of negative and chunks of positive elements. A much simplified example would be an array a that looks like: array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])

(a<0).sum() and (a>0).sum() give me the total number of negative and positive elements, but how do I count these in order? By this I mean I want to know that my array contains first 3 negative elements, then 6 positive, then 2 negative.

This sounds like a topic that has been addressed somewhere, and there may be a duplicate out there, but I can't find one.

One method is to use numpy.roll(a,1) in a loop over the whole array and count the elements of a given sign as they appear in, e.g., the first position of the array as it rolls, but that looks neither very numpyic (or pythonic) nor very efficient to me.

Answer 1

Here's one vectorized approach -

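import numpy as np

def pos_neg_counts(a):
    # Assumes a contains no zeros and changes sign at least once
    mask = a>0
    # Last index of every run except the final one
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Run lengths: first run, middle runs, last run
    count = np.concatenate(( [idx[0]+1], idx[1:] - idx[:-1], [a.size-1-idx[-1]] ))
    if a[0]<0:
        return count[1::2], count[::2] # pos, neg counts
    else:
        return count[::2], count[1::2] # pos, neg counts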

Sample runs -

In [155]: a
Out[155]: array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])

In [156]: pos_neg_counts(a)
Out[156]: (array([6]), array([3, 2]))

In [157]: a[0] = 3

In [158]: a
Out[158]: array([ 3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])

In [159]: pos_neg_counts(a)
Out[159]: (array([1, 6]), array([2, 2]))

In [160]: a[-1] = 7

In [161]: a
Out[161]: array([ 3, -2, -1,  1,  2,  3,  4,  5,  6, -5,  7])

In [162]: pos_neg_counts(a)
Out[162]: (array([1, 6, 1]), array([2, 1]))
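
To see where the counts come from, the intermediate arrays for the first sample can be printed step by step. This is only an illustrative sketch: the name bounds is not in the answer above, and the np.diff form is an equivalent rewrite of the concatenation in pos_neg_counts that also tolerates a non-empty array with no sign change.

```python
import numpy as np

a = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])

mask = a > 0                                  # True where the element is positive
idx = np.flatnonzero(mask[1:] != mask[:-1])   # last index of every run except the final one
# bounds (hypothetical name): the position before the first run, then the end of each run
bounds = np.concatenate(([-1], idx, [a.size - 1]))
count = np.diff(bounds)                       # length of each run, in order

print(idx.tolist())     # [2, 8]
print(count.tolist())   # [3, 6, 2]
```

Splitting count into its even and odd positions, as pos_neg_counts does, then separates the runs of one sign from the runs of the other.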

Runtime test

Other approach(es) -

# @Franz's solution
def split_app(my_array):
    negative_index = my_array<0
    splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
    len_list = [len(i) for i in splits]
    return len_list

Timings on a bigger dataset -

In [20]: # Setup input array
    ...: reps = np.random.randint(3,10,(100000))
    ...: signs = np.ones(len(reps),dtype=int)
    ...: signs[::2] = -1
    ...: a = np.repeat(signs, reps)*np.random.randint(1,9,reps.sum())
    ...: 

In [21]: %timeit split_app(a)
10 loops, best of 3: 90.4 ms per loop

In [22]: %timeit pos_neg_counts(a)
100 loops, best of 3: 2.21 ms per loop

Answer 2

Just use

my_array = np.array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
negative_index = my_array<0

and you'll get a boolean mask that marks the negative values. After that you can split this array wherever the mask changes:

splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)

and then compute the size of each inner array:

len_list = [len(i) for i in splits]
print(len_list)

And you'll get what you are looking for:

Out[1]: [3, 6, 2]

You just have to keep track of the sign of your first element, since the run lengths alternate from there. In my example it is a negative one.
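
One way to avoid having to remember the sign of the first element is to carry a sign label alongside each run length. The following is a sketch under assumptions: the helper name signed_run_lengths is made up for illustration, the input is assumed non-empty, and zeros are grouped with the positive runs because the mask is my_array < 0.

```python
import numpy as np

def signed_run_lengths(my_array):
    # Hypothetical helper, not part of the answer above.
    negative_index = my_array < 0
    # Start index of every run: 0, plus each position right after a sign change
    starts = np.concatenate(([0], np.flatnonzero(np.diff(negative_index)) + 1))
    # Distance from each start to the next start (or to the end of the array)
    lengths = np.diff(np.concatenate((starts, [my_array.size])))
    # Sign label of each run, read from the mask at its first element
    signs = np.where(negative_index[starts], -1, 1)
    return list(zip(signs.tolist(), lengths.tolist()))

my_array = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
print(signed_run_lengths(my_array))   # [(-1, 3), (1, 6), (-1, 2)]
```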

So just execute:

import numpy as np

my_array = np.array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
negative_index = my_array<0
splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
len_list = [len(i) for i in splits]
print(len_list)

Answer 3

My (rather simple and probably inefficient) solution would be:

import numpy as np
arr = np.array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
sgn = np.sign(arr[0])
res = []
cntr = 1 # counting the first one
for i in range(1, len(arr)):
    if np.sign(arr[i]) != sgn:
        # sign flipped: store the finished run and start a new one
        res.append(cntr)
        cntr = 0
        sgn *= -1
    cntr += 1
res.append(cntr) # store the last run
print(res)
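
The same loop can be written more compactly with itertools.groupby from the standard library, which groups consecutive items that share a key. This is an alternative sketch, not what the answer above uses; it still iterates in Python, so it should not be expected to match the speed of the vectorized approaches. As with the mask-based answers, zeros would be grouped with the positive runs here.

```python
from itertools import groupby

import numpy as np

arr = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
# Group consecutive elements by "is negative" and count the members of each group
res = [sum(1 for _ in group) for _, group in groupby(arr.tolist(), key=lambda x: x < 0)]
print(res)   # [3, 6, 2]
```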

Answer 1
2, accepted, 2017-06-06 08:36:22

Answer 2
1, 2017-06-06 08:27:07

Answer 3
0, 2017-06-06 08:51:28
