Python Pandas: create new column with min values based on unique identifiers in other columns
I have a dataframe:
import pandas as pd
df = pd.DataFrame({'person': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y'],
                   'bank': ['chase', 'bod', 'chase', 'boa', 'chase', 'bod', 'chase', 'boa'],
                   'amount': [100, 80, 90, 60, 150, 111, 524, 51]})
Assume there could be many people in the "person" column, but there are only "chase" and "boa" in the "bank" column, and every person will have both. For each person, I want to get the minimum value in the "amount" column for each bank.
Each row of the output should have the person, the minimum amount in chase ("chase_min"), and the minimum amount in boa ("boa_min").
Thank you!
You can use 'min' as the aggfunc for a pivot table.
pd.pivot_table(df, index='person', columns=['bank'], values='amount', aggfunc='min')
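A minimal sketch of how the pivot table could be shaped into the layout the question asks for. The filter on "chase" and "boa", the "_min" suffix, and the variable name `wide` are assumptions made for illustration, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'person': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y'],
    'bank': ['chase', 'bod', 'chase', 'boa', 'chase', 'bod', 'chase', 'boa'],
    'amount': [100, 80, 90, 60, 150, 111, 524, 51],
})

wide = (
    # Assumption: only the two banks named in the question are wanted.
    df[df['bank'].isin(['chase', 'boa'])]
    # One row per person, one column per bank, minimum amount in each cell.
    .pivot_table(index='person', columns='bank', values='amount', aggfunc='min')
    # Rename the bank columns to boa_min / chase_min.
    .add_suffix('_min')
    # Drop the leftover "bank" label on the column axis.
    .rename_axis(columns=None)
    # Turn the person index back into an ordinary column.
    .reset_index()
    # Put the columns in the order used in the question.
    [['person', 'chase_min', 'boa_min']]
)

print(wide)
```

Without the filter, any other value in the "bank" column (such as "bod" in the sample data) would get its own "_min" column as well.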
I think the current answer is overkill, and this solution has the advantage of producing a convenient index:
import pandas as pd
df = pd.DataFrame({'person': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y'],
'bank': ['chase', 'bod', 'chase', 'boa', 'chase', 'bod', 'chase', 'boa'],
'amount': [100, 80, 90, 60, 150, 111, 524, 51]})
res = df.groupby(["person", "bank"]).min()
print(f"{df}\n\n{res}")
Output:
  person   bank  amount
0      x  chase     100
1      x    bod      80
2      x  chase      90
3      x    boa      60
4      y  chase     150
5      y    bod     111
6      y  chase     524
7      y    boa      51

              amount
person bank         
x      boa        60
       bod        80
       chase      90
y      boa        51
       bod       111
       chase     150
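If one row per person is preferred, the grouped result can be pivoted with unstack. This is a sketch under the assumption that the "amount" column is selected before aggregating so the result is a Series; the name `wide` is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'person': ['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y'],
                   'bank': ['chase', 'bod', 'chase', 'boa', 'chase', 'bod', 'chase', 'boa'],
                   'amount': [100, 80, 90, 60, 150, 111, 524, 51]})

# Minimum amount per (person, bank) pair, as a Series with a two-level index.
res = df.groupby(['person', 'bank'])['amount'].min()

# Move the "bank" index level into columns: one row per person.
wide = res.unstack('bank')

print(wide)
# Individual values can be looked up by person and bank.
print(wide.loc['x', 'chase'])
```

The two-level index also allows direct lookups on the grouped result itself, for example res.loc[('y', 'boa')].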