Pandas groupby: find the minimum of a column, returning NaN if it doesn't exist
Suppose I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'id': [1,1,1,2,3,2], 'year': ['2020', '2014', '2002', '2020', '2016', '2014'], 'e': [True, False, True, True, False, True]})
print(df.to_string(index=False))
id year e
1 2020 True
1 2014 False
1 2002 True
2 2020 True
3 2016 False
2 2014 True
For each id, I want to find the minimum year among the rows where e is True, and return NaN if that id has no True in e. The final result would be:
id year
1 2002
2 2014
3 NaN
Try filtering before the groupby, then reindex to bring back the ids that were filtered out:
s = df.loc[df.e].groupby('id').year.min().reindex(df.id.unique()).reset_index()
s
Out[307]:
id year
0 1 2002
1 2 2014
2 3 NaN
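The one-liner above chains three steps, which can be easier to follow when written out separately. The following is only a sketch of the same idea: the intermediate names (true_rows, min_year, result) are made up for illustration, and converting year to int is an added assumption so that the minimum is compared numerically rather than as strings.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 1, 2, 3, 2],
    'year': ['2020', '2014', '2002', '2020', '2016', '2014'],
    'e': [True, False, True, True, False, True],
})

# Assumption (not in the original post): treat year as an integer
# so that min() compares numbers instead of strings.
df['year'] = df['year'].astype(int)

# Step 1: keep only the rows where e is True. id 3 has no such row,
# so it disappears from the data at this point.
true_rows = df.loc[df['e']]

# Step 2: minimum year for each id that is still present.
min_year = true_rows.groupby('id')['year'].min()

# Step 3: reindex against every id in the original frame. Ids that were
# filtered out in step 1 come back with NaN as their value.
result = min_year.reindex(df['id'].unique()).reset_index()
```

Because reindex introduces a missing value, the year column in result ends up as a float column rather than an integer one.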
Or convert id to Categorical:
df['id'] = pd.Categorical(df['id'])
# observed=False keeps categories that have no rows left after filtering
df.loc[df.e].groupby('id', observed=False).year.min()
Out[309]:
id
1 2002
2 2014
3 None
Name: year, dtype: object
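The Categorical approach works because a categorical column remembers all of its categories even after rows are filtered out, so groupby can still report a group for id 3. The following is a minimal sketch of that behaviour under two assumptions that are not in the original answer: year is converted to int so the empty group shows up as NaN in a numeric column, and observed=False is passed explicitly, since the default of that parameter differs between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 1, 2, 3, 2],
    'year': ['2020', '2014', '2002', '2020', '2016', '2014'],
    'e': [True, False, True, True, False, True],
})

# Assumption (not in the original post): numeric years.
df['year'] = df['year'].astype(int)

# The categorical dtype records the full set of ids: 1, 2 and 3.
df['id'] = pd.Categorical(df['id'])

# Filtering removes every row for id 3, but the category itself survives.
# observed=False asks groupby to emit a group for every category,
# including those with no remaining rows.
result = df.loc[df['e']].groupby('id', observed=False)['year'].min()
```

Compared with the reindex version, this keeps the list of expected ids in the column's dtype instead of passing it in separately, at the cost of changing the dtype of id in df.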
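Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.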