Python groupby agg multiple columns according to one column
I have this df, and I want to group by value_to_groupby and, for each group, get the rest of the columns from the row with the minimal value_to_agg_upon.
Example:
   value_to_groupby  value_to_agg_upon  value_to_copy
0                 1                "0"            "a"
1                 1                "1"            "b"
2                 1                "2"            "c"
3                 2                "5"            "d"
4                 2                "4"            "e"
Wanted result:
value_to_agg_upon value_to_copy
1 "0" "a"
2 "4" "e"
Trying:
df.groupby("value_to_groupby").agg({"value_to_agg_upon": min})
gives this:
value_to_agg_upon
1 "0"
2 "4"
Try:
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)
value_to_agg_upon value_to_copy
0 0 "a"
1 4 "e"
df.iloc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), :].set_index('value_to_groupby')
value_to_agg_upon value_to_copy
value_to_groupby
1 0 "a"
2 4 "e"
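The idxmin approach above can be sketched as a self-contained script. Note that idxmin returns index labels rather than positions, so this sketch uses df.loc instead of df.iloc; the two agree only when the frame has the default 0..n-1 index. The non-default index values (10 to 14) are hypothetical, added only to show the difference, and the sketch assumes every string in value_to_agg_upon parses as a number.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value_to_groupby": [1, 1, 1, 2, 2],
        "value_to_agg_upon": ["0", "1", "2", "5", "4"],
        "value_to_copy": ["a", "b", "c", "d", "e"],
    },
    index=[10, 11, 12, 13, 14],  # hypothetical non-default index
)

# Compare as numbers, not as strings (assumes every value is a numeric string)
numeric = pd.to_numeric(df["value_to_agg_upon"])

# idxmin yields the index *label* of each group's minimum
idx = numeric.groupby(df["value_to_groupby"]).idxmin()

# Select by label, so this also works when the index is not 0..n-1
result = df.loc[idx.tolist()].set_index("value_to_groupby")
print(result)
```

Because the rows are taken from the original df, value_to_agg_upon keeps its original string values in the result.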
Add this first (it converts the strings containing numbers to int):
import ast
df['value_to_agg_upon'] = df['value_to_agg_upon'].apply(ast.literal_eval).astype(int)
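Or:
df['value_to_agg_upon'] = pd.to_numeric(df['value_to_agg_upon'])
Note: apply makes the code slow, so prefer pd.to_numeric. Avoid errors='ignore' here: it silently leaves the column as strings when parsing fails, and it is deprecated in recent pandas versions.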
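Why the conversion matters: a minimum taken over strings is lexicographic, which can differ from the numeric minimum as soon as the values have different numbers of digits. A small sketch with hypothetical values (not taken from the question):

```python
import pandas as pd

s = pd.Series(["10", "9", "2"])  # hypothetical numeric strings

# Strings compare character by character, so "10" sorts before "2" and "9"
string_min = s.min()

# After conversion the comparison is numeric
numeric_min = pd.to_numeric(s).min()

print(string_min, numeric_min)
```

In the question's sample data all values are single digits, so both orderings happen to agree, but that is not safe to rely on in general.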
Try this (after converting value_to_agg_upon to a numeric type):
pd.merge(df, df.groupby('value_to_groupby', as_index=False)['value_to_agg_upon'].min(), on=['value_to_groupby', 'value_to_agg_upon'])
Output:
value_to_groupby value_to_agg_upon value_to_copy
0 1 0 a
1 2 4 e
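One difference between the merge approach and the idxmin approach is how they treat ties. The following sketch uses hypothetical data with a tied minimum in group 1 (this data is not from the question): merging on the per-group minimum keeps every row that matches it, while idxmin picks only the first occurrence.

```python
import pandas as pd

# Hypothetical data: group 1 has two rows sharing the minimum value 0
df = pd.DataFrame(
    {
        "value_to_groupby": [1, 1, 2],
        "value_to_agg_upon": [0, 0, 4],
        "value_to_copy": ["a", "b", "e"],
    }
)

# Per-group minimum as a two-column frame
mins = df.groupby("value_to_groupby", as_index=False)["value_to_agg_upon"].min()

# Inner merge keeps all rows whose value equals the group minimum
merged = pd.merge(df, mins, on=["value_to_groupby", "value_to_agg_upon"])

# idxmin keeps only the first minimal row per group
first_only = df.loc[df.groupby("value_to_groupby")["value_to_agg_upon"].idxmin()]

print(merged)
print(first_only)
```

Which behaviour is preferable depends on whether duplicate minima should all be reported or collapsed to a single row.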
**I previously posted a wrong answer.**
Since the data is of object/string type, one smart thing to do (using the answer provided by pygirl) is to convert a copy to numeric and use that copy only to locate the rows:
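import pandas as pd

df = pd.DataFrame({"value_to_groupby": [1, 1, 1, 2, 2],
                   "value_to_agg_upon": ["0", "1", "2", "5", "4"],
                   "value_to_copy": ["a", "b", "c", "d", "e"]})
# Convert a copy to numeric so the original strings stay untouched in df
df_numeric = df.copy()
df_numeric["value_to_agg_upon"] = pd.to_numeric(df_numeric["value_to_agg_upon"])
# Use the numeric copy only to find the row of each group's minimum
df.iloc[df_numeric.groupby("value_to_groupby")["value_to_agg_upon"].idxmin(), 1:].reset_index(drop=True)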