Count the number of occurrences between markers in a python list

I have a boolean (numpy) array, and I want to count how many occurrences of 'True' there are between the Falses.

E.g., for a sample list:

b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F] 

should produce

ml = [3,3,1]

My initial attempt was this snippet:

i = 0
ml = []
for el in b_List:
  if (b_List):
    i += 1
  ml.append(i)
  i = 0

But it keeps appending elements to ml for each F in b_List.


Thank you all for your answers. Sadly, I can't accept all of the answers as correct. I've accepted Akavall's answer because he referred to my initial attempt (I know what I did wrong now) and also compared Mark's and Ashwini's posts.

Please don't take the accepted solution as the definitive answer, since both of the other suggestions introduce alternative methods that work equally well.

itertools.groupby provides one easy way to do this:

>>> import itertools
>>> T, F = True, False
>>> b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F]
>>> [len(list(group)) for value, group in itertools.groupby(b_List) if value]
[3, 3, 1]
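
To see why this works: groupby collapses consecutive equal values into (key, group) pairs, so keeping only the groups whose key is True leaves exactly the runs being counted. A minimal sketch of that idea follows; the helper name `true_run_lengths` is hypothetical and not part of the original answer.

```python
from itertools import groupby


def true_run_lengths(values):
    """Return the length of each run of consecutive True values."""
    # sum(1 for _ in group) counts a run without building a temporary list
    return [sum(1 for _ in group) for key, group in groupby(values) if key]


T, F = True, False
b_List = [T, T, T, F, F, F, F, T, T, T, F, F, T, F]

# What groupby produces before filtering: one (key, run length) pair per run
runs = [(key, len(list(group))) for key, group in groupby(b_List)]
print(runs)
# [(True, 3), (False, 4), (True, 3), (False, 2), (True, 1), (False, 1)]

print(true_run_lengths(b_List))
# [3, 3, 1]
```

Because the filter only looks at the key, leading or trailing runs of False need no special handling.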

Using NumPy:

>>> import numpy as np
>>> a = np.array([ True,  True,  True, False, False, False, False,  True,  True, True, False, False,  True, False], dtype=bool)
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([3, 3, 1])

>>> a = np.array([True, False, False, True, True, False, False, True, False])
>>> np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
array([1, 2, 1])

Can't say that this is the best NumPy solution, but it is still faster than itertools.groupby:

>>> from itertools import groupby
>>> lis = [ True,  True,  True, False, False, False, False,  True,  True, True, False, False,  True, False]*1000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
100 loops, best of 3: 9.58 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
1000 loops, best of 3: 1.4 ms per loop

>>> lis = [ True,  True,  True, False, False, False, False,  True,  True, True, False, False,  True, False]*10000
>>> a = np.array(lis)
>>> %timeit [len(list(group)) for value, group in groupby(lis) if value]
1 loops, best of 3: 95.5 ms per loop
>>> %timeit np.diff(np.insert(np.where(np.diff(a)==1)[0]+1, 0, 0))[::2]
100 loops, best of 3: 14.9 ms per loop

As @justhalf and @Mark Dickinson pointed out in the comments, the above code does not work in some cases (for example, when the array starts with False), so you need to pad the array with False on both ends first:

In [28]: a                                                                                        
array([ True,  True,  True, False, False, False, False,  True,  True,
        True, False, False,  True, False], dtype=bool)

In [29]: np.diff(np.where(np.diff(np.hstack([False, a, False])))[0])[::2]
Out[29]: array([3, 3, 1])
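
The padded one-liner above can be unpacked into named steps. This is only a sketch under my own assumptions: the function name `true_run_lengths_np` is hypothetical, and it relies on np.diff treating neighbouring booleans as "changed / not changed".

```python
import numpy as np


def true_run_lengths_np(a):
    """Lengths of the runs of consecutive True values in a 1-D boolean array."""
    a = np.asarray(a, dtype=bool)
    # Pad with False so that every run of True has both a start and an end edge
    padded = np.concatenate(([False], a, [False]))
    # For boolean input, diff marks the positions where the value changes
    edges = np.flatnonzero(np.diff(padded))
    # The edges alternate: start, end, start, end, ...
    starts, ends = edges[::2], edges[1::2]
    return ends - starts


a = np.array([True, True, True, False, False, False, False,
              True, True, True, False, False, True, False])
print(true_run_lengths_np(a))                           # [3 3 1]
print(true_run_lengths_np([False, True, True, False]))  # [2]
```

The second call is a case where the unpadded version goes wrong, because the first run does not start at index 0.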

Your original try has some problems:

i = 0
ml = []
for el in b_List:
    if (b_List): # b_list is a list and will evaluate to True
                 # unless you have an empty list, you want if (el)
        i += 1
    ml.append(i) # even if the above line was correct you still get here
                 # on every iteration, and you don't want that
    i = 0

You probably want something like this:

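def count_Trues(b_list):
    i = 0          # length of the current run of Trues
    ml = []        # lengths of the completed runs
    prev = False
    for el in b_list:
        if el:
            i += 1
            prev = el
        else:
            if prev:          # a run of Trues has just ended
                ml.append(i)
                i = 0
            prev = el
    if prev:                  # the list ended in the middle of a run
        ml.append(i)
    return ml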

Result:

>>> T, F = True, False
>>> b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F] 
>>> count_Trues(b_List)
[3, 3, 1]
>>> b_List.extend([T,T])
>>> count_Trues(b_List)
[3, 3, 1, 2]
>>> b_List.extend([F])
>>> count_Trues(b_List)
[3, 3, 1, 2]

This solution runs surprisingly fast:

In [5]: T, F = True, False

In [6]: b_List = [T,T,T,F,F,F,F,T,T,T,F,F,T,F] 

In [7]: new_b_List = b_List * 100

In [8]: import numpy as np

# Ashwini Chaudhary's Solution
In [9]: %timeit np.diff(np.insert(np.where(np.diff(new_b_List)==1)[0]+1, 0, 0))[::2]
1000 loops, best of 3: 299 us per loop

In [11]: %timeit count_Trues(new_b_List)
1000 loops, best of 3: 130 us per loop

In [12]: new_b_List = b_List * 1000

# Ashwini Chaudhary's Solution 
In [13]: %timeit np.diff(np.insert(np.where(np.diff(new_b_List)==1)[0]+1, 0, 0))[::2]
100 loops, best of 3: 2.25 ms per loop

In [14]: %timeit count_Trues(new_b_List)
100 loops, best of 3: 1.33 ms per loop
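
Note that the NumPy timings above pass a Python list to np.diff, so they include the list-to-array conversion on every call. Timings also depend on the machine and on the Python and NumPy versions, so it is worth re-running the comparison yourself. A rough harness using the standard timeit module might look like this; the function names are hypothetical, and `with_loop` is a condensed variant of the loop approach rather than the exact code from the answer.

```python
import timeit
from itertools import groupby

import numpy as np


def with_groupby(values):
    return [sum(1 for _ in g) for key, g in groupby(values) if key]


def with_loop(values):
    counts, run = [], 0
    for v in values:
        if v:
            run += 1
        elif run:              # a run of Trues has just ended
            counts.append(run)
            run = 0
    if run:                    # input ended in the middle of a run
        counts.append(run)
    return counts


def with_numpy(arr):
    padded = np.concatenate(([False], arr, [False]))
    edges = np.flatnonzero(np.diff(padded))
    return edges[1::2] - edges[::2]


T, F = True, False
data = [T, T, T, F, F, F, F, T, T, T, F, F, T, F] * 1000
arr = np.array(data)  # converted once, outside the timed code

# Check that all three agree before comparing their speed
assert with_groupby(data) == with_loop(data) == with_numpy(arr).tolist()

for name, func, arg in (("groupby", with_groupby, data),
                        ("loop", with_loop, data),
                        ("numpy", with_numpy, arr)):
    best = min(timeit.repeat(lambda: func(arg), number=20, repeat=3)) / 20
    print(f"{name:8s} {best * 1e3:.3f} ms per call")
```

Passing `data` instead of `arr` to `with_numpy` would show how much of the cost comes from the conversion alone.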
