How to find the longest consecutive string of values in pandas dataframe
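I am looking for the longest run of zeros in a pandas df. I have a DataFrame with 10 columns, each containing 25,000 rows that hold nulls, zeros, or non-zero numbers. I am trying to compute: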
1. For each column, the length of the longest run of consecutive zeros.
2. For each column, the length of the longest run of consecutive zeros AND nulls.
For example, if the first column is:
[col1:1,2,4,5,6,2,3,0,0,0,0,1,2,... (remaining all numbers)]
it would return 4.
Thanks
Setup
Consider the dataframe df
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    col0=[1, 2, 3, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 9],
    col1=[1, 2, 3, 0, 0, 4, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 2, 0, 4, 8, 9]
))
Solution
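def max_zeros(c):
    # True where the value is non-zero (NaN counts as non-zero here)
    v = c.values != 0
    # Pad with True on both sides so a zero run at either end is closed,
    # then measure the gaps between the positions where v flips
    d = np.diff(np.flatnonzero(np.diff(np.concatenate([[True], v, [True]]))))
    # Every other gap is the length of a zero run; 0 if the column has no zeros
    return d[::2].max() if d.size else 0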
df.apply(max_zeros)
col0 6
col1 3
dtype: int64
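The question also asks for the longest run of zeros and nulls combined, and a mask built with `!= 0` treats NaN as non-zero. One way to cover that case is to build the mask from both conditions. This is only a minimal sketch, not part of the original answer: the name `max_zeros_or_nulls` and the `sample` data are made up for illustration.

```python
import numpy as np
import pandas as pd

def max_zeros_or_nulls(c):
    # Hypothetical helper: True where the value is zero or null
    hit = ((c == 0) | c.isna()).to_numpy()
    # Pad with False on both sides so runs touching either end are closed
    padded = np.concatenate([[False], hit, [False]])
    # Positions where the mask flips: even entries start a run, odd entries end it
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    lengths = edges[1::2] - edges[::2]
    # Return 0 when the column has no zeros or nulls at all
    return int(lengths.max()) if lengths.size else 0

# Made-up sample data for illustration only
sample = pd.DataFrame(dict(
    a=[1, 0, np.nan, 0, 2, 0, 0],
    b=[np.nan, np.nan, 3, 0, 4, 5, 6],
))
result = sample.apply(max_zeros_or_nulls)
```

Dropping the `c.isna()` term from the mask should give the zeros-only count with the same edge handling.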
If you have a dataframe like
df = pd.DataFrame([[1, 2, 4, 5, 6, 2, 3, 0, 0, 0, 0, 1, 2], [1, 0, 0, 2, 0, 2, 0, 0, 0, 0, 0, 1, 2]])
you can use itertools groupby
from itertools import groupby

def get_conti(a):
    # Collect the length of every run of consecutive zeros in the row
    m = []
    for key, run in groupby(a):
        if key == 0:
            m.append(len(list(run)))
    # default=0 covers rows that contain no zeros
    return max(m, default=0)

df['max'] = df.apply(get_conti, axis=1)
Output:
   0  1  2  3  4  5  6  7  8  9  10  11  12  max
0  1  2  4  5  6  2  3  0  0  0   0   1   2    4
1  1  0  0  2  0  2  0  0  0  0   0   1   2    5
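The question stores each series in a column rather than a row. As a rough sketch under that assumption, the same itertools groupby idea can be applied once per column. The name `longest_zero_run` is hypothetical, and the data is copied from the setup of the first answer.

```python
from itertools import groupby

import pandas as pd

def longest_zero_run(values):
    # Hypothetical helper: length of the longest run of consecutive zeros,
    # or 0 when the input contains no zeros
    return max(
        (sum(1 for _ in run) for key, run in groupby(values) if key == 0),
        default=0,
    )

df_cols = pd.DataFrame(dict(
    col0=[1, 2, 3, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 9],
    col1=[1, 2, 3, 0, 0, 4, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 2, 0, 4, 8, 9],
))
# axis=0 is the default for DataFrame.apply, so the function is called once per column
per_column = df_cols.apply(longest_zero_run)
```

On this data `per_column` holds 6 for `col0` and 3 for `col1`, matching the NumPy-based answer.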