
Count groups of values in Pandas series

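I have a Pandas (pandas==0.23.4) DataFrame df with a datetime index and a column named value_id.

value_id contains groups of float values (5.0, 6.0) and groups of NaN. I want to count the number of consecutive groups of 5.0 and of 6.0. A group only counts if it contains at least three consecutive values.

For example: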

In [1]: print df.value_id
timestamp
2019-01-06 17:42:08    NaN
2019-01-06 17:45:08    5.0
2019-01-06 17:48:08    5.0
2019-01-06 17:51:08    5.0
2019-01-06 17:54:08    NaN
2019-01-06 17:57:08    NaN
2019-01-06 18:00:08    NaN
2019-01-06 18:03:08    NaN
2019-01-06 18:06:08    NaN
2019-01-06 18:09:08    NaN
2019-01-06 18:12:08    6.0
2019-01-06 18:15:08    6.0
2019-01-06 19:54:09    NaN
2019-01-06 19:57:09    5.0
2019-01-06 20:00:08    5.0
2019-01-06 20:03:08    5.0
2019-01-06 20:06:09    NaN
2019-01-06 20:09:08    NaN
2019-01-06 20:12:08    NaN
2019-01-06 20:15:09    NaN
2019-01-06 20:18:08    NaN
2019-01-06 20:21:09    NaN
2019-01-06 20:24:09    NaN
2019-01-07 19:09:07    NaN
2019-01-07 19:12:06    NaN
2019-01-07 19:15:06    5.0
2019-01-07 19:18:06    5.0
2019-01-07 19:21:07    5.0
2019-01-07 19:24:07    5.0
2019-01-07 19:27:07    NaN
2019-01-07 19:30:07    NaN
2019-01-07 19:33:06    NaN
2019-01-07 19:36:07    NaN
2019-01-07 19:39:07    NaN
2019-01-07 19:42:06    NaN
2019-01-07 19:45:06    NaN
2019-01-07 19:48:06    NaN
2019-01-07 19:51:06    6.0
2019-01-07 19:54:07    6.0
2019-01-07 19:57:06    6.0
Name: value_id, dtype: float64

If I have two variables named count1 (for the groups of 5.0) and count2 (for the groups of 6.0), the expected counts for the example above are:

count1: 3

count2: 1

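Perhaps not the most elegant, but you can use shift to check that the next two items have the same value and that the previous value is not part of the same group: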

# Flag the first row of every run of at least three equal values:
# the row and the next two rows match, and the previous row does not.
df['fives'] = ((df['value_id'] == 5) & (df['value_id'].shift(-1) == 5)
                & (df['value_id'].shift(-2) == 5)
                & (df['value_id'].shift(1) != 5)).astype(int)
df['sixes'] = ((df['value_id'] == 6) & (df['value_id'].shift(-1) == 6)
                & (df['value_id'].shift(-2) == 6)
                & (df['value_id'].shift(1) != 6)).astype(int)

df[['fives','sixes']].sum()
fives    3
sixes    1
dtype: int64
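
The shift check above can be wrapped in a small helper so that the value and the minimum run length become parameters. This is only a sketch: the function name count_group_starts, the min_len parameter and the shortened sample series are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd


def count_group_starts(s, value, min_len=3):
    """Count runs of `value` in `s` that are at least `min_len` long.

    Hypothetical helper: a row is a run start when it equals `value`
    and the previous row does not (NaN != value is True).
    """
    mask = (s == value) & (s.shift(1) != value)
    # The next min_len - 1 rows must also equal `value`.
    for k in range(1, min_len):
        mask &= (s.shift(-k) == value)
    return int(mask.sum())


# Shortened stand-in for df.value_id (assumed data, plain integer index).
nan = np.nan
s = pd.Series(
    [nan, 5, 5, 5, nan, nan, 6, 6, nan, 5, 5, 5, nan, 5, 5, 5, 5, nan, 6, 6, 6],
    dtype="float64",
    name="value_id",
)

count1 = count_group_starts(s, 5.0)
count2 = count_group_starts(s, 6.0)
print(count1, count2)
```

Because only the first row of each run is flagged, a run longer than min_len is still counted once.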

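IIUC, create the group key with cumsum over the null mask, then just use value_counts. Here s is the Series of values (df.value_id in the question; it is named timestamp in the output below):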

s.groupby(s.isnull().cumsum()).value_counts().ge(3).sum(level=1)
Out[1026]: 
timestamp
5.0    3.0
6.0    1.0
Name: timestamp, dtype: float64
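
As far as I know, the level argument of Series.sum used above was deprecated in pandas 1.3 and removed in 2.0, so on newer versions the one-liner needs a different final step. The following is a sketch of a related run-length variant rather than the answer's exact method: it keys each run on a change of value instead of on the NaN positions, so two different values that touch without a NaN in between are still treated as separate runs. The name count_runs, the min_len parameter and the shortened sample series are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def count_runs(s, min_len=3):
    """Count, per value, the runs of equal consecutive non-NaN values
    that are at least `min_len` long (hypothetical helper)."""
    # A new run starts wherever the value differs from the previous row.
    # NaN != NaN, so every NaN gets its own run id; NaN rows are dropped next.
    run_id = (s != s.shift()).cumsum()
    valid = s.notna()
    runs = s[valid].groupby(run_id[valid]).agg(["first", "count"])
    long_runs = runs[runs["count"] >= min_len]
    # One row per qualifying run, so counting the values counts the runs.
    return long_runs["first"].value_counts().sort_index()


# Shortened stand-in for df.value_id (assumed data, plain integer index).
nan = np.nan
s = pd.Series(
    [nan, 5, 5, 5, nan, nan, 6, 6, nan, 5, 5, 5, nan, 5, 5, 5, 5, nan, 6, 6, 6],
    dtype="float64",
    name="value_id",
)

counts = count_runs(s)
print(counts)
```

On this sample, counts holds 3 for 5.0 and 1 for 6.0, matching count1 and count2 from the question.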
