Python: split a DataFrame into subsets with a from/to condition

I have a DataFrame that I need to split into subsets under the following condition:

A subset starts where c = 1 and ends where c = -1.

Example:

 a         b    c
False   False   
False   False  -1 
False   False   
True    False   1  start first subset
False   False   
False   False   1
False   False   1
False   False   
False   False   1
False   False   
False   False   1
False   False   
False   True   -1 end of first subset
False   False   
False   False   
False   True   -1
False   False   
False   False   
True    False   1 start second subset
False   False  -1 end of second subset
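
To pin the rule down, here is a minimal loop-based sketch of the labelling that the example seems to ask for. The helper name `label_subsets` is hypothetical, and the sketch assumes (from the example, not from any stated rule) that a 1 seen while already inside a subset and a -1 seen outside any subset are simply ignored.

```python
def label_subsets(c):
    """Return one label per row: 0 outside any subset, 1, 2, ... inside."""
    labels = []
    inside = False
    count = 0
    for value in c:
        # A 1 opens a new subset only when no subset is currently open.
        if not inside and value == 1:
            inside = True
            count += 1
        labels.append(count if inside else 0)
        # A -1 closes the open subset; the closing row itself is included.
        if inside and value == -1:
            inside = False
    return labels


nan = float("nan")  # stands in for the blank cells of column c
example_c = [nan, -1, nan, 1, nan, 1, 1, nan, 1, nan,
             1, nan, -1, nan, nan, -1, nan, nan, 1, -1]
labels = label_subsets(example_c)
print(labels)
```

Rows labelled 1 and 2 correspond to the first and second subsets marked in the table; rows labelled 0 belong to neither.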

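Here is one possible solution, though I am not sure it is the most efficient. It uses cumsum together with some and/or logic: count the starts and ends seen so far, and treat a row as inside a subset while the start count is ahead of the end count.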

import pandas as pd
import numpy as np

df = pd.DataFrame({'c': [np.nan, 1, np.nan, 1, np.nan, np.nan,
                         -1, np.nan, np.nan, 1, np.nan, np.nan,
                         1, 1, np.nan, -1, -1, 1, -1]})

      c
0   NaN
1   1.0
2   NaN
3   1.0
4   NaN
5   NaN
6  -1.0
7   NaN
8   NaN
9   1.0
10  NaN
11  NaN
12  1.0
13  1.0
14  NaN
15 -1.0
16 -1.0
17  1.0
18 -1.0

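(
    df
    .assign(
        # True where a marker (1 or -1) differs from the previous marker,
        # i.e. where a subset actually opens or closes. fill_value=-1 keeps
        # a -1 that appears before the first 1 from counting as an end.
        start_end=lambda df: df.index.isin(
            df
            .loc[lambda df: df.c.isin([1, -1])]
            .loc[lambda df: df.c.shift(1, fill_value=-1) != df.c]
            .index),
        start=lambda df: np.where(df.start_end & (df.c == 1), 1, 0),
        end=lambda df: np.where(df.start_end & (df.c == -1), 1, 0),
        # A row is inside a subset while more starts than ends have been
        # seen; the end count is shifted so the closing row is included.
        subset=lambda df: np.where(
            df.start.cumsum() != df.end.shift(1, fill_value=0).cumsum(),
            df.start.cumsum(),
            0)
    )
    # Drop the helper columns, keeping only the subset label (0 = outside).
    .drop(columns=['start_end', 'start', 'end'])
)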

      c  subset
0   NaN       0
1   1.0       1
2   NaN       1
3   1.0       1
4   NaN       1
5   NaN       1
6  -1.0       1
7   NaN       0
8   NaN       0
9   1.0       2
10  NaN       2
11  NaN       2
12  1.0       2
13  1.0       2
14  NaN       2
15 -1.0       2
16 -1.0       0
17  1.0       3
18 -1.0       3
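
Once every row carries a subset label, the actual split can be done with a groupby. The following is only a sketch: it rebuilds the labelled frame printed above by hand so that it runs on its own, and the names `labelled` and `subsets` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# The labelled frame printed above, rebuilt by hand so this runs on its own.
labelled = pd.DataFrame({
    'c': [np.nan, 1, np.nan, 1, np.nan, np.nan, -1, np.nan, np.nan, 1,
          np.nan, np.nan, 1, 1, np.nan, -1, -1, 1, -1],
    'subset': [0, 1, 1, 1, 1, 1, 1, 0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 3, 3],
})

# Drop the rows outside any subset (label 0), then split on the label.
subsets = {
    label: group.drop(columns='subset')
    for label, group in labelled[labelled['subset'] > 0].groupby('subset')
}

for label, frame in subsets.items():
    print(label, frame.index.tolist())
```

Each value in the dictionary is a separate DataFrame that keeps the original row index, so the subsets can still be traced back to their positions in the source frame.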

