How to groupby and calculate max based on a filter in a pandas dataframe
So, I have a dataframe like this:
I want to group by Field1 and, if a group's count is greater than 2, find the row with the max Field2 in that group and set a new field to True for it.
I tried:
import pandas as pd
df = pd.read_csv("c:/test.csv")
df["Field3"] = df.groupby(["Field1"])["Field2"].transform("max")
But it didn't work.
We have to do an additional transform to check which Field1 groups have a count greater than 2:
g = df.groupby("Field1")["Field2"]
df['Field3'] = g.transform('count').gt(2) & df['Field2'].eq(g.transform('max'))
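To see the two transforms at work, here is a minimal self-contained sketch. The sample data is an assumption, reconstructed from the Field1 and Field2 columns of the result table in this answer, since the original CSV is not shown; the intermediate names `group_size` and `group_max` are only for illustration.

```python
import pandas as pd

# Sample data assumed from the result table in this answer
# (the original c:/test.csv is not available).
df = pd.DataFrame({
    "Field1": ["a", "a", "a", "b", "c", "b"],
    "Field2": [3, 5, 3, 2, 1, 6],
})

g = df.groupby("Field1")["Field2"]

# transform broadcasts each group-level value back onto every row of the group
group_size = g.transform("count")  # non-null Field2 values per Field1 group
group_max = g.transform("max")     # largest Field2 per Field1 group

# True only where the group has more than 2 rows AND the row holds the group max
df["Field3"] = group_size.gt(2) & df["Field2"].eq(group_max)

print(df)
```

Only group a has more than 2 rows here, so only its max row gets flagged. Groups b and c stay False even though each of them has a max value of its own.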
Alternatively, you can use a single transform with a lambda function to check both conditions, but this might be slower than the first approach on larger dataframes:
df['Field3'] = df.groupby("Field1")["Field2"].transform(
    lambda s: (s == s.max()) & (len(s) > 2))
  Field1  Field2  Field3
0      a       3   False
1      a       5    True
2      a       3   False
3      b       2   False
4      c       1   False
5      b       6   False
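As a quick sanity check, the two approaches can be compared side by side. This is only a sketch on assumed sample data (taken from the table above); `two_step` and `one_step` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data assumed from the result table above.
df = pd.DataFrame({
    "Field1": ["a", "a", "a", "b", "c", "b"],
    "Field2": [3, 5, 3, 2, 1, 6],
})

g = df.groupby("Field1")["Field2"]

# Approach 1: two built-in transforms combined with a boolean AND
two_step = g.transform("count").gt(2) & df["Field2"].eq(g.transform("max"))

# Approach 2: a single transform with a lambda, called once per group
one_step = g.transform(lambda s: s.eq(s.max()) & (len(s) > 2)).astype(bool)

# Both should flag exactly the same rows on this data
print((two_step == one_step).all())
```

One difference to keep in mind: "count" only counts non-null Field2 values, while len(s) counts every row in the group, so the two versions can disagree if Field2 contains NaN.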