Counting consecutive occurrences in a series of numbers

Question

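I have a series of numbers (a 1-D array), e.g. 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, ...

Is there an elegant (and preferably the fastest) way to count how many times 1 or 0 occurs consecutively before the value changes? For the example above, the result would be (0, 2), (1, 3), (0, 1), (1, 4), ...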

Answer 1

Here is another NumPy approach, specifically leveraging array slicing:

import numpy as np

def islands_info(a):
    # Compare consecutive elements for changes. Use `True` as sentinels to detect edges
    idx = np.flatnonzero(np.r_[True, a[:-1] != a[1:], True])

    # Index into the input array with all but the last boundary index to get
    # each run's value, and take the difference of the boundaries for the lengths
    return np.column_stack((a[idx[:-1]], np.diff(idx)))

Sample run:

In [51]: a = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 1])

In [52]: islands_info(a)
Out[52]: 
array([[0, 2],
       [1, 3],
       [0, 1],
       [1, 4]])

If you need the output as a list of tuples:

In [56]: list(zip(*islands_info(a).T))
Out[56]: [(0, 2), (1, 3), (0, 1), (1, 4)]
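
To see why the sentinel trick works, it can help to look at the intermediate arrays for the sample input. The following is a minimal sketch; the variable names `mask`, `values`, and `lengths` are illustrative and do not appear in the answer.

```python
import numpy as np

a = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 1])

# True wherever a[i] differs from a[i-1]; the leading and trailing True
# are sentinels marking the start of the first run and the end of the last.
mask = np.r_[True, a[:-1] != a[1:], True]

# Positions of the run boundaries: run k starts at idx[k] and ends
# just before idx[k+1].
idx = np.flatnonzero(mask)

values = a[idx[:-1]]    # value at the start of each run
lengths = np.diff(idx)  # distance between consecutive boundaries

print(idx)      # [ 0  2  5  6 10]
print(values)   # [0 1 0 1]
print(lengths)  # [2 3 1 4]
```

Because the comparison uses `!=` rather than arithmetic, the same idea should also apply to arrays whose elements cannot be subtracted, although that case is not covered in the answer.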

Timings:

Comparison with @yatu's other NumPy-based approach:

In [43]: np.random.seed(a)

In [44]: a = np.random.choice([0,1], 1000000)

In [45]: %timeit yatu(a)
11.7 ms ± 428 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [46]: %timeit islands_info(a)
8.98 ms ± 40.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [47]: np.random.seed(a)

In [48]: a = np.random.choice([0,1], 10000000)

In [49]: %timeit yatu(a)
232 ms ± 3.71 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [50]: %timeit islands_info(a)
152 ms ± 933 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Answer 2

You can use groupby from itertools:

from itertools import groupby
x = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
occ = [(i, len([*y,])) for i,y in groupby(x)]

Output:

In [23]: [(i, len([*y,])) for i,y in groupby(x)]
Out[23]: [(1, 1), (0, 2), (1, 1), (0, 1), (1, 2), (0, 3)]
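
As a small variation, the run length can be counted without materializing each group as a list. This is only a sketch; the function name `run_lengths` is hypothetical and not part of the answer.

```python
from itertools import groupby

def run_lengths(seq):
    """Return (value, run_length) pairs for consecutive equal items."""
    # sum(1 for _ in group) counts the items without building a list
    return [(value, sum(1 for _ in group)) for value, group in groupby(seq)]

x = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
print(run_lengths(x))
# [(1, 1), (0, 2), (1, 1), (0, 1), (1, 2), (0, 3)]
```

Whether this is faster than `len([*y,])` depends on the run lengths; it mainly avoids the temporary list.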

Answer 3

Here is a NumPy approach that performs well:

import numpy as np

a = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 1])

# indices where a change takes place
changes = np.flatnonzero(np.diff(a) != 0)
# include the initial and end index
ix = np.r_[0, changes + 1, a.shape[0]]
# index the array at the change points to get each run's value, and
# stack with the run lengths, obtained by taking the diff over ix
np.column_stack([np.r_[a[changes], a[a.shape[0] - 1]], np.diff(ix)])

array([[0, 2],
       [1, 3],
       [0, 1],
       [1, 4]], dtype=int64)

Timings:

def yatu(a):
    changes = np.flatnonzero(np.diff(a)!=0)
    ix = np.r_[0,changes+1,a.shape[0]]
    return np.column_stack([np.r_[a[changes], a[a.shape[0]-1]], np.diff(ix)])

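from itertools import groupby

def groupby_counts(a):
    # Named differently from itertools.groupby so the call below does not recurse
    return [(i, len([*y,])) for i,y in groupby(a)]

a = np.random.choice([0,1], 10_000)

%timeit groupby_counts(list(a))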
# 1.83 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit yatu(a)
# 150 µs ± 14.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
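
Before comparing speed, it is worth confirming that the approaches return the same runs. The sketch below restates the functions from the answers so it can run on its own; the helper name `groupby_counts` and the seed value are arbitrary choices made here, not taken from the answers.

```python
import numpy as np
from itertools import groupby

def islands_info(a):
    idx = np.flatnonzero(np.r_[True, a[:-1] != a[1:], True])
    return np.column_stack((a[idx[:-1]], np.diff(idx)))

def yatu(a):
    changes = np.flatnonzero(np.diff(a) != 0)
    ix = np.r_[0, changes + 1, a.shape[0]]
    return np.column_stack([np.r_[a[changes], a[a.shape[0] - 1]], np.diff(ix)])

def groupby_counts(a):
    return [(i, len(list(g))) for i, g in groupby(a)]

rng = np.random.default_rng(0)  # seed chosen arbitrarily
a = rng.integers(0, 2, size=10_000)

# Use the pure-Python result as the reference and compare both NumPy versions
expected = groupby_counts(a.tolist())
assert [tuple(r) for r in yatu(a).tolist()] == expected
assert [tuple(r) for r in islands_info(a).tolist()] == expected
```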

Answer 4

I use reduce from functools. The groupby approach shown by Ayoub is probably better (faster), because this one copies the accumulator on every step.

from functools import reduce

l = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]
p = lambda acc, x : acc[:-1] + [(x, acc[-1][1] + 1)] if acc and x == acc[-1][0] else acc + [(x, 1)]
result = reduce(p, l, [])

print(result)

[(0, 2), (1, 3), (0, 1), (1, 4)]
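
The copying mentioned above can be avoided by updating the last pair in place with a plain loop. This is a sketch of that alternative, not the reduce-based method itself; the name `run_lengths_loop` is hypothetical.

```python
def run_lengths_loop(seq):
    """Return (value, run_length) pairs, updating the last pair in place."""
    result = []
    for x in seq:
        if result and result[-1][0] == x:
            # Same value as the current run: bump its count
            result[-1] = (x, result[-1][1] + 1)
        else:
            # A new run starts here
            result.append((x, 1))
    return result

l = [0, 0, 1, 1, 1, 0, 1, 1, 1, 1]
print(run_lengths_loop(l))
# [(0, 2), (1, 3), (0, 1), (1, 4)]
```

Each element is handled in constant time, so the whole pass is linear in the input length, whereas rebuilding the accumulator list at every step grows with the number of runs seen so far.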

4 answers

Answer 1
2 2019-06-18 08:46:33

Answer 2
1 2019-06-18 07:30:28

Answer 3
1 2019-06-18 07:54:54

Answer 4
0 2019-06-18 07:37:59
