Count groups of consecutive 1s in pandas
I have a list of 1s and 0s and I would like to count the number of groups of consecutive 1s.
mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
Counting by hand gives 3 groups, but is there a way to do it in Python?
Here I count every jump from 0 to 1. Prepending a 0 ensures that a run of 1s at the very start of the list is also counted.
import numpy as np
mylist_arr = np.array([0] + [0,0,1,1,0,1,1,1,1,0,1,0])
diff = np.diff(mylist_arr)
count = np.sum(diff == 1)
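The same edge-detection idea extends to the length of each group, which also gives the average group length. The following is only a sketch, assuming the input contains just 0s and 1s; the function name `run_lengths` is hypothetical and not part of the answer above.

```python
import numpy as np

def run_lengths(values):
    """Return the length of each run of consecutive 1s in a 0/1 sequence."""
    # Pad with a 0 on both sides so runs touching either end still produce edges.
    padded = np.concatenate(([0], np.asarray(values, dtype=int), [0]))
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)   # position where each run begins
    ends = np.flatnonzero(diff == -1)    # position just past the end of each run
    return ends - starts

mylist = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
lengths = run_lengths(mylist)
print(len(lengths))     # number of groups
print(lengths.mean())   # average group length
```

Padding on both sides means every run has exactly one rising and one falling edge, so the two index arrays always have the same length.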
You can try this:
import numpy as np
import pandas as pd
df=pd.DataFrame(data = [0,0,1,1,0,1,1,1,1,0,1,0])
df['Gid']=df[0].diff().eq(1).cumsum()
df=df[df[0].eq(1)]
df.groupby('Gid').size()
Out[245]:
Gid
1 2
2 4
3 1
dtype: int64
sum(df.groupby('Gid').size())/len(df.groupby('Gid').size())
Out[244]: 2.3333333333333335
Here's one solution:
durations = []
for n, d in enumerate(mylist):
    # A 1 at the start of the list, or a 1 that follows a 0, opens a new group.
    if (n == 0 and d == 1) or (n > 0 and mylist[n-1] == 0 and d == 1):
        durations.append(1)
    # Any other 1 extends the current group.
    elif d == 1:
        durations[-1] += 1

def mean(x):
    return sum(x)/len(x)

print(durations)
print(mean(durations))
Option 1
With pandas. First, initialise a dataframe:
In [78]: df
Out[78]:
Col1
0 0
1 0
2 1
3 1
4 0
5 1
6 1
7 1
8 1
9 0
10 1
11 0
Now divide the total sum by the number of groups:
In [79]: df.sum() / df.diff().eq(1).cumsum().max()
Out[79]:
Col1 2.333333
dtype: float64
If you want just the number of groups, df.diff().eq(1).cumsum().max() is enough.
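One caveat: diff() yields NaN for the first row, so a run of 1s that starts in the very first row is not counted by the expression above. A sketch that covers that case by comparing each value with its predecessor, assuming a column of 0s and 1s; `count_groups_pd` is a hypothetical name introduced here for illustration.

```python
import pandas as pd

def count_groups_pd(series):
    """Count runs of consecutive 1s in a 0/1 pandas Series."""
    # shift(fill_value=0) treats the element before the first row as 0,
    # so a run starting at row 0 registers as a 0 -> 1 transition.
    previous = series.shift(fill_value=0)
    return int((series.eq(1) & previous.eq(0)).sum())

s = pd.Series([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0])
print(count_groups_pd(s))                        # 3
print(count_groups_pd(pd.Series([1, 1, 0, 1])))  # 2
```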
Option 2
With itertools.groupby:
In [88]: sum(array) / sum(1 if sum(g) else 0 for _, g in itertools.groupby(array))
Out[88]: 2.3333333333333335
If you want just the number of groups, sum(1 if sum(g) else 0 for _, g in itertools.groupby(array)) is enough.
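itertools.groupby can also report the length of each group, from which both the count and the mean follow. A sketch, assuming `array` is a plain list of 0s and 1s; the helper name `ones_run_lengths` is hypothetical.

```python
import itertools

def ones_run_lengths(array):
    """Return the lengths of the runs of 1s, in order of appearance."""
    return [sum(1 for _ in group)            # count the items in this run
            for key, group in itertools.groupby(array)
            if key == 1]                     # keep only the runs of 1s

array = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
lengths = ones_run_lengths(array)
print(lengths)                      # [2, 4, 1]
print(len(lengths))                 # 3
print(sum(lengths) / len(lengths))  # 2.3333333333333335
```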
You can try this:
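mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
previous = mylist[0]
count = 0
for i in mylist[1:]:
    if i == 1:
        if previous == 0:
            previous = 1
    else:
        if i == 0:
            if previous == 1:
                # A 1 -> 0 transition closes a group.
                count += 1
                previous = 0
# A group that runs to the end of the list has no closing 0; count it too.
if previous == 1:
    count += 1
print(count)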
Output:
3
Take a look at itertools.groupby:
import itertools
import operator
def get_1_groups(ls):
    return sum(map(operator.itemgetter(0), itertools.groupby(ls)))
This works because itertools.groupby returns (the iterable equivalent of):
itertools.groupby([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0])
# ==>
[(0, [0, 0]), (1, [1, 1]), (0, [0]), (1, [1, 1, 1, 1]), (0, [0]), (1, [1]), (0, [0])]
So you are just summing the first item of each pair.
If the list can contain other nonzero items, they would add to the sum.
You can do something like this:
def count_groups(ls, target=1):
    return sum(target == value for value, _ in itertools.groupby(ls))
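A short usage sketch of this idea with a list that holds values other than 0 and 1. The sample data is made up for illustration, and the function is repeated so that the snippet runs on its own.

```python
import itertools

def count_groups(ls, target=1):
    # groupby yields one (value, group) pair per run; count the runs equal to target.
    return sum(target == value for value, _ in itertools.groupby(ls))

data = [0, 2, 2, 1, 1, 0, 1, 2, 1]
print(count_groups(data))            # 3
print(count_groups(data, target=2))  # 2
```

Because only the run's value is compared with `target`, other values in the list no longer inflate the result.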
This can be accomplished without much work by simply counting the number of times the list transitions from 0 to 1 (counting rising signal edges):
count = 0
last = 0
for element in mylist:
    if element != last:
        last = element
        if element:  # 1 is truthy
            count += 1
print(count)
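The rising-edge count can also be written by pairing each element with its predecessor. This is a sketch of an equivalent formulation rather than the answer's own code, and it assumes the list contains only 0s and 1s; `count_rising_edges` is a hypothetical name.

```python
def count_rising_edges(values):
    """Count 0 -> 1 transitions, treating the position before the list as 0."""
    values = list(values)
    # zip pairs every element with its predecessor; the leading 0 stands in
    # for the missing predecessor of the first element.
    return sum(1 for prev, cur in zip([0] + values, values)
               if prev == 0 and cur == 1)

mylist = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(count_rising_edges(mylist))  # 3
```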
Here is my solution, which finds the length of the longest run of 1s:
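c = [1, 0, 1, 1, 1, 0]
longest = 0   # longest run of 1s seen so far (renamed from max, which shadows the built-in)
counter = 0   # length of the current run
for j in c:
    if j == 1:
        counter += 1
    else:
        # A 0 ends the current run; record it if it is the longest so far.
        if counter > longest:
            longest = counter
        counter = 0
# Account for a run that reaches the end of the list.
if counter > longest:
    longest = counter
print(longest)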
A quick and dirty one-liner (almost):
import re
mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
print(len(re.sub(r'0+', '0', ''.join(str(x) for x in mylist)).strip('0').split('0')))
3
Step by step:
import re
mylist = [0,0,1,1,0,1,1,1,1,0,1,0]
sal1 = ''.join(str(x) for x in mylist) # returns a string from the list
sal2 = re.sub(r'0+', '0', sal1) # remove duplicates of zeroes
sal3 = sal2.strip('0') # remove 0s from the start & the end of the string
sal4 = len(sal3.split('0')) # split the string into a list using '0' as the separator, and calculate its length
This gives:
sal1 -> 001101111010
sal2 -> 01101111010
sal3 -> 110111101
sal4 -> 3
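Note that the split-based version reports 1 for a list with no 1s at all, because splitting an empty string still yields one element. A sketch of a variant that matches the runs of 1s directly with re.findall, assuming the list holds only 0s and 1s; `count_runs_regex` is a hypothetical name.

```python
import re

def count_runs_regex(values):
    """Count runs of consecutive 1s by matching them in the joined string."""
    text = ''.join(str(x) for x in values)
    # Each non-overlapping match of '1+' is one maximal run of 1s.
    return len(re.findall(r'1+', text))

print(count_runs_regex([0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]))  # 3
print(count_runs_regex([0, 0, 0]))                             # 0
```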