Python groupby agg multiple columns according to one column
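I have this df, and I want to group by value_to_groupby and, for each group, take the row with the minimum value_to_agg_upon, keeping all the remaining columns.
Example: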
value_to_groupby value_to_agg_upon value_to_copy
0 1 "0" "a"
1 1 "1" "b"
2 1 "2" "c"
3 2 "5" "d"
4 2 "4" "e"
Desired result:
value_to_agg_upon value_to_copy
1 "0" "a"
2 "4" "e"
I tried:
df.groupby("value_to_groupby").agg({"value_to_agg_upon": min})
which gives this:
value_to_agg_upon
1 "0"
2 "4"
Try:
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)
value_to_agg_upon value_to_copy
0 0 "a"
1 4 "e"
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), :].set_index('value_to_groupby')
value_to_agg_upon value_to_copy
value_to_groupby
1 0 "a"
2 4 "e"
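One caveat with the two snippets above: idxmin returns index labels, not positions, so passing its result to df.iloc only works because the example frame has the default 0..n-1 index. Below is a minimal sketch, assuming the column is already numeric, that looks the rows up with df.loc instead. The helper name rows_at_group_min and the non-default index are made up for illustration.

```python
import pandas as pd

def rows_at_group_min(df, group_col, value_col):
    """Return, for each group, the full row holding the smallest value_col."""
    # idxmin yields the index *label* of the minimum in each group,
    # so the rows are looked up with .loc rather than .iloc.
    labels = df.groupby(group_col)[value_col].idxmin()
    return df.loc[labels].set_index(group_col)

# Same data as the question, but with a non-default index (an assumption
# made here to show why label-based lookup matters).
df = pd.DataFrame(
    {
        "value_to_groupby": [1, 1, 1, 2, 2],
        "value_to_agg_upon": [0, 1, 2, 5, 4],
        "value_to_copy": ["a", "b", "c", "d", "e"],
    },
    index=[10, 11, 12, 13, 14],
)

result = rows_at_group_min(df, "value_to_groupby", "value_to_agg_upon")
print(result)
```

With this index, df.iloc would fail because positions 10 and 14 do not exist in a five-row frame.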
Add this first (it converts the strings containing numbers to int):
import ast
df['value_to_agg_upon'] = df['value_to_agg_upon'].apply(ast.literal_eval).astype(int)
Or:
df['value_to_agg_upon'] = pd.to_numeric(df['value_to_agg_upon'])
Note: apply will make the code slower.
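The conversion matters because the minimum of a string column is taken lexicographically, not numerically. The question's single-digit values happen to sort the same either way, so here is a small sketch with made-up values (not from the question) where the two orderings disagree:

```python
import pandas as pd

# Hypothetical values chosen so that string order and numeric order differ.
s = pd.Series(["9", "10", "25"])

string_min = s.min()                  # lexicographic: "10" < "25" < "9"
numeric_min = pd.to_numeric(s).min()  # numeric: 9 < 10 < 25

print(string_min, numeric_min)
```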
Try this:
pd.merge(df, df.groupby('value_to_groupby', as_index=False)['value_to_agg_upon'].min(), on=['value_to_groupby','value_to_agg_upon'])
Output:
value_to_groupby value_to_agg_upon value_to_copy
0 1 0 a
1 2 4 e
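One behavioral difference from the idxmin approach is worth keeping in mind: merging back on the minimum keeps every row that ties for the group minimum, while idxmin keeps only the first. A sketch showing this, assuming a numeric column and using data with a tie added for illustration (not the question's data):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value_to_groupby": [1, 1, 2, 2],
        "value_to_agg_upon": [0, 0, 5, 4],   # group 1 has a tie at 0
        "value_to_copy": ["a", "b", "d", "e"],
    }
)

# Per-group minimum as a two-column frame, then join back to recover full rows.
mins = df.groupby("value_to_groupby", as_index=False)["value_to_agg_upon"].min()
via_merge = pd.merge(df, mins, on=["value_to_groupby", "value_to_agg_upon"])

# idxmin returns a single label per group (the first occurrence of the minimum).
via_idxmin = df.loc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin()]

print(via_merge)
print(via_idxmin)
```

Which behavior is preferable depends on whether tied rows should all be reported or collapsed to one.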
**I posted a wrong answer earlier.**
Since the data is of object/string type, one neat approach (building on the answer from pygirl) is:
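import pandas as pd

df = pd.DataFrame({"value_to_groupby":[1,1,1,2,2],
                   "value_to_agg_upon":["0","1","2","5","4"],
                   "value_to_copy":["a","b","c","d","e"]})
# Work on a numeric copy so the original string column is left untouched
df_numeric = df.copy()
df_numeric["value_to_agg_upon"] = pd.to_numeric(df_numeric["value_to_agg_upon"])
# idxmin returns index labels; they match positions here because df has the default 0..n-1 index
df.iloc[df_numeric.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)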