Python groupby agg multiple columns according to one column
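I have this df, and I want to group by value_to_groupby and, for each group, take all the remaining columns from the row that has the minimum value_to_agg_upon.
Example: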
   value_to_groupby value_to_agg_upon value_to_copy
0                 1               "0"           "a"
1                 1               "1"           "b"
2                 1               "2"           "c"
3                 2               "5"           "d"
4                 2               "4"           "e"
Desired result:
   value_to_agg_upon value_to_copy
1                "0"           "a"
2                "4"           "e"
What I tried:
df.groupby("value_to_groupby").agg({"value_to_agg_upon": min})
This gives only the aggregated column:
   value_to_agg_upon
1                "0"
2                "4"
Try:
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)
   value_to_agg_upon value_to_copy
0                  0           "a"
1                  4           "e"
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), :].set_index('value_to_groupby')
                  value_to_agg_upon value_to_copy
value_to_groupby
1                                 0           "a"
2                                 4           "e"
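Since idxmin returns index labels rather than positions, the same idea can also be written with loc, which keeps working when the frame does not have a default 0..n-1 index. A minimal sketch, assuming the column has already been converted to numbers; the custom index and the names `idx` and `result` are illustrative, not from the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value_to_groupby": [1, 1, 1, 2, 2],
        "value_to_agg_upon": [0, 1, 2, 5, 4],  # assumed already numeric
        "value_to_copy": ["a", "b", "c", "d", "e"],
    },
    index=[10, 11, 12, 13, 14],  # hypothetical non-default index
)

# idxmin returns the index *label* of the minimum row in each group
idx = df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin()

# loc selects by label, so the whole matching row is carried along
result = df.loc[idx].set_index("value_to_groupby")
```

Here `result` holds one row per group, with value_to_copy taken from the row where value_to_agg_upon is smallest.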
Add this first (it converts the strings containing numbers to int):
import ast
df['value_to_agg_upon'] = df['value_to_agg_upon'].apply(ast.literal_eval).astype(int)
Or:
df['value_to_agg_upon'] = pd.to_numeric(df['value_to_agg_upon'])
Note: apply calls a Python function for every element, so it is slower than the vectorized pd.to_numeric.
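The conversion matters because strings are compared character by character, so a string minimum can differ from the numeric one once values have more than one digit. A small illustration with made-up values (they are not from the question):

```python
import pandas as pd

s = pd.Series(["9", "10", "2"])  # hypothetical string values

# Comparing as strings: "10" sorts before "2" and "9"
string_min = s.min()

# Comparing as numbers after conversion
numeric_min = pd.to_numeric(s).min()
```

With the single-digit strings in the question both comparisons happen to agree, which is why the problem is easy to miss.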
Try this (after converting value_to_agg_upon to numbers):
pd.merge(df, df.groupby('value_to_groupby', as_index=False)['value_to_agg_upon'].min(), on=['value_to_groupby', 'value_to_agg_upon'])
Output:
   value_to_groupby  value_to_agg_upon value_to_copy
0                 1                  0             a
1                 2                  4             e
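One difference from the idxmin approach is worth knowing: when several rows in a group share the minimum, the merge keeps all of them, while idxmin picks only the first. A sketch with a hypothetical tie added to the data (the names `mins`, `merged` and `first_only` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "value_to_groupby": [1, 1, 2, 2],
    "value_to_agg_upon": [0, 0, 5, 4],  # hypothetical tie in group 1
    "value_to_copy": ["a", "b", "d", "e"],
})

# Per-group minimum as a two-column frame
mins = df.groupby("value_to_groupby", as_index=False)["value_to_agg_upon"].min()

# Inner merge keeps every row whose (group, value) pair equals the minimum
merged = pd.merge(df, mins, on=["value_to_groupby", "value_to_agg_upon"])

# idxmin keeps only the first minimal row per group
first_only = df.loc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin()]
```

Which behaviour is preferable depends on whether tied rows should all be reported or collapsed to one.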
**I posted a wrong answer earlier.**
Since the data is of object/string type,
a neat way to handle it (building on the answer provided by pygirl) is:
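import pandas as pd

df = pd.DataFrame({"value_to_groupby": [1, 1, 1, 2, 2],
                   "value_to_agg_upon": ["0", "1", "2", "5", "4"],
                   "value_to_copy": ["a", "b", "c", "d", "e"]})
# Work on a numeric copy so the original string values in df stay untouched
df_numeric = df.copy()
df_numeric["value_to_agg_upon"] = pd.to_numeric(df_numeric["value_to_agg_upon"])
# Find the minimum rows on the numeric copy, then select them from the original df
df.iloc[df_numeric.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)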
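A possible variation on the same idea, offered as a sketch rather than as part of the original answer: instead of copying the whole frame, convert only the one column into a separate Series and group that Series by the key column. The names `numeric`, `idx` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"value_to_groupby": [1, 1, 1, 2, 2],
                   "value_to_agg_upon": ["0", "1", "2", "5", "4"],
                   "value_to_copy": ["a", "b", "c", "d", "e"]})

# Numeric view of the string column; df itself is left unchanged
numeric = pd.to_numeric(df["value_to_agg_upon"])

# Group the numeric Series by df's key column (aligned on the shared index)
idx = numeric.groupby(df["value_to_groupby"]).idxmin()

# Select the winning rows from the original frame, strings intact
result = df.loc[idx, ["value_to_agg_upon", "value_to_copy"]].reset_index(drop=True)
```

The result still contains the original string values "0" and "4", since only the ordering was computed numerically.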