
How to find the longest consecutive string of values in pandas dataframe

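I am looking for the longest string of zeros in a pandas df. I have a df with 10 columns, each containing 25000 rows, and each value is either null, zero, or a non-zero number. I am trying to compute: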

 1. A value giving the longest run of consecutive zeros
    in each column, for all the columns.
 2. A value giving the longest run of consecutive zeros
    AND nulls in each column, for all the columns.

For example, if the first column is:

[col1:1,2,4,5,6,2,3,0,0,0,0,1,2,... (remaining all numbers)]

this would return 4.

Thanks

Setup

Consider the dataframe df

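import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    col0=[1, 2, 3, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 9],
    col1=[1, 2, 3, 0, 0, 4, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 2, 0, 4, 8, 9]
))

def max_zeros(c):
    # True where the value is non-zero (NaN also compares as non-zero here)
    v = c.values != 0
    # Pad with True on both ends so a run of zeros at the start or end
    # of the column still has both a start and an end boundary
    d = np.diff(np.flatnonzero(np.diff(np.concatenate([[True], v, [True]]))))
    # Every other gap is the length of a zero run; 0 if there are no zeros
    return d[::2].max() if d.size else 0

df.apply(max_zeros)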

col0    6
col1    3
dtype: int64
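
The function above treats nulls as non-zero, because NaN != 0 evaluates to True. For the second part of the question (zeros AND nulls counted together), one possible sketch is to build the mask from both conditions. The helper name max_zeros_or_nulls and the small sample frame are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def max_zeros_or_nulls(c):
    # True where the value is zero or missing (NaN/None)
    hit = (c == 0) | c.isna()
    # Pad with False on both sides so every run has a start and an end
    padded = np.concatenate([[False], hit.to_numpy(), [False]])
    # Positions where the mask flips; they come in (start, end) pairs
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    lengths = edges[1::2] - edges[::2]
    # Columns with no zeros or nulls yield 0
    return int(lengths.max()) if lengths.size else 0

# Hypothetical sample data, only to exercise the helper
sample = pd.DataFrame({
    "a": [1, 0, np.nan, 0, 2, 0, 0, 3],
    "b": [1, 2, 3, 4, 5, 6, 7, 8],
})
result = sample.apply(max_zeros_or_nulls)
```

In the sample, column a has a run of zero, NaN, zero (length 3) followed later by a run of two zeros, and column b has neither zeros nor nulls.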

If you have a dataframe like

import pandas as pd

df = pd.DataFrame([[1, 2, 4, 5, 6, 2, 3, 0, 0, 0, 0, 1, 2], [1, 0, 0, 2, 0, 2, 0, 0, 0, 0, 0, 1, 2]])

you can use itertools groupby

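from itertools import groupby

def get_conti(a):
    m = []
    # groupby yields one (value, run) pair per stretch of equal consecutive values
    for key, run in groupby(a):
        if key == 0:
            m.append(len(list(run)))
    # default=0 avoids a ValueError for rows that contain no zeros
    return max(m, default=0)

# axis=1 applies the function to each row
df['max'] = df.apply(get_conti, axis=1)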

Output:

   0  1  2  3  4  5  6  7  8  9  10  11  12  max
0  1  2  4  5  6  2  3  0  0  0   0   1   2    4
1  1  0  0  2  0  2  0  0  0  0   0   1   2    5
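
The example above works row by row, while the question stores its data in columns. As a rough sketch, the same groupby idea can be applied per column by leaving apply at its default axis=0. The function name longest_zero_run is hypothetical, and the data is the two-column frame from the first answer.

```python
from itertools import groupby

import pandas as pd

def longest_zero_run(values):
    # Length of each run whose value is 0; default=0 covers "no zeros at all"
    return max(
        (sum(1 for _ in run) for key, run in groupby(values) if key == 0),
        default=0,
    )

df = pd.DataFrame(dict(
    col0=[1, 2, 3, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 9],
    col1=[1, 2, 3, 0, 0, 4, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 2, 0, 4, 8, 9],
))

# apply defaults to axis=0, so the function receives one column at a time
per_column = df.apply(longest_zero_run)
```

This should agree with the NumPy version on the same data (6 for col0, 3 for col1). On 25000-row columns the pure-Python loop is likely slower than the vectorised approach.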

