Get max and min values based on conditions in a pandas DataFrame
I have a DataFrame like this:
| count | A | B | Total |
|---|---|---|---|
| yes | 4900 | 0 | 0 |
| yes | 1000 | 1000 | 0 |
| sum_yes | 5900 | 1000 | 0 |
| yes | 4000 | 0 | 0 |
| yes | 1000 | 0 | 0 |
| sum_yes | 5000 | 0 | 0 |
I want a result like the one below: only for rows where 'count' equals 'sum_yes', set Total to the maximum of columns A and B if B is 0, and to the minimum of A and B otherwise.
| count | A | B | Total |
|---|---|---|---|
| yes | 4900 | 0 | 0 |
| yes | 1000 | 1000 | 0 |
| sum_yes | 5900 | 1000 | 1000 |
| yes | 4000 | 0 | 0 |
| yes | 1000 | 0 | 0 |
| sum_yes | 5000 | 0 | 5000 |
I have tried this so far:
df['Total'] = [df[['A', 'B']].where(df['count'] == 'sum_yes').max(axis=0) if
'B'==0 else df[['A', 'B']]
.where(df['count'] == 'sum_yes').min(axis=0)]
But I am getting: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any idea how to solve this?
You can use numpy.where:
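import numpy as np

# Row-wise max of A and B where count == "sum_yes" and B == 0;
# row-wise min of A and B on every other row.
new_values = np.where(
    (df["count"] == "sum_yes") & (df["B"] == 0),
    df[["A", "B"]].max(axis=1),
    df[["A", "B"]].min(axis=1),
)
df.assign(Total=new_values)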
     count     A     B  Total
0      yes  4900     0      0
1      yes  1000     0      0
2  sum_yes  5900  1000   1000
3      yes  4000  1000   1000
4      yes  1000     0      0
5  sum_yes  5000     0   5000
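Note that the np.where call above applies the row-wise minimum to every row that is not a 'sum_yes' row with B == 0, so a plain 'yes' row with a non-zero B also gets a non-zero Total. If you want Total to change only on the 'sum_yes' rows, as in the expected table in the question, one possible variant is numpy.select with the existing Total as the default. This is a minimal sketch, not part of the original answer: the sample frame is retyped from the question, and the helper names is_sum, b_is_zero, row_max and row_min are just illustrative. It also sidesteps the original ValueError, which comes from putting a whole Series comparison into a Python if/else that needs a single True or False.

```python
import numpy as np
import pandas as pd

# Sample data retyped from the question (Total starts at 0 everywhere).
df = pd.DataFrame(
    {
        "count": ["yes", "yes", "sum_yes", "yes", "yes", "sum_yes"],
        "A": [4900, 1000, 5900, 4000, 1000, 5000],
        "B": [0, 1000, 1000, 0, 0, 0],
        "Total": [0, 0, 0, 0, 0, 0],
    }
)

# Element-wise boolean masks (illustrative names).
is_sum = df["count"] == "sum_yes"
b_is_zero = df["B"] == 0

# Row-wise max and min across columns A and B.
row_max = df[["A", "B"]].max(axis=1)
row_min = df[["A", "B"]].min(axis=1)

# np.select takes the first matching condition per row;
# rows matching neither condition keep their existing Total.
df["Total"] = np.select(
    [is_sum & b_is_zero, is_sum & ~b_is_zero],
    [row_max, row_min],
    default=df["Total"],
)
print(df)
```

With this version the plain 'yes' rows are left untouched, and only the two 'sum_yes' rows receive a computed value.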