Pandas groupby with lambda and in the list
I have the following dataframe:
import pandas as pd

df = pd.DataFrame({'ItemType': ['Red', 'White', 'Red', 'Blue', 'White', 'White', 'White', 'Green'],
                   'ItemPrice': [10, 11, 12, 13, 14, 15, 16, 17],
                   'ItemID': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D']})
I would like to get the records (rows) for ItemIDs that contain only the "White" ItemType, in the form of a DataFrame.
I have attempted the following solution:
types = ['Red','Blue','Green']
~df.groupby('ItemID')['ItemType'].any().apply(lambda u: u in(types))
But this gives me an incorrect result (D should be False), and it comes back as a Series rather than a DataFrame:
A False
B False
C True
D True
Thank you!
You should avoid using apply here, as it is usually quite slow. Instead, assign a flag column before you groupby, and then use all to assert that none of a group's values are in types:
df.assign(flag=~df.ItemType.isin(types)).groupby('ItemID').flag.all()
ItemID
A False
B False
C True
D False
Name: flag, dtype: bool
However, just to demonstrate the logic of the operation, and to show what was incorrect about your approach, here is a working version using apply:
~df.groupby('ItemID').ItemType.apply(lambda x: any(i in types for i in x))
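You need to use any inside the lambda, on each group's values, as opposed to on the Series before using apply. In your attempt, any has already collapsed each group to a single value by the time the lambda runs, so the membership test never sees every ItemType in the group.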
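To make the per-group logic concrete, here is a minimal sketch that walks over the groups by hand and applies the same test the lambda performs. The names only_white and has_other_type are illustrative and not part of the original answer; this is meant for inspection, not as a faster alternative.

```python
import pandas as pd

df = pd.DataFrame({'ItemType': ['Red', 'White', 'Red', 'Blue', 'White', 'White', 'White', 'Green'],
                   'ItemPrice': [10, 11, 12, 13, 14, 15, 16, 17],
                   'ItemID': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D']})
types = ['Red', 'Blue', 'Green']

# Illustrative helper dict: ItemID -> True if the group holds no excluded type.
only_white = {}
for item_id, group in df.groupby('ItemID')['ItemType']:
    # any() sees every value in the group, which is what the lambda needs.
    has_other_type = any(i in types for i in group)
    only_white[item_id] = not has_other_type
    print(item_id, list(group), only_white[item_id])
```

Group D contains both White and Green, so the check inside the loop marks it False, which is the result the question expected.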
To access rows where this condition is met, you may use transform:
df[df.assign(flag=~df.ItemType.isin(types)).groupby('ItemID').flag.transform('all')]
ItemType ItemPrice ItemID
4 White 14 C
5 White 15 C
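As a sketch of why transform works for row selection: it broadcasts each group's result back onto the original index, so the output can be used directly as a boolean mask. The intermediate names flag, mask and rows are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'ItemType': ['Red', 'White', 'Red', 'Blue', 'White', 'White', 'White', 'Green'],
                   'ItemPrice': [10, 11, 12, 13, 14, 15, 16, 17],
                   'ItemID': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D']})
types = ['Red', 'Blue', 'Green']

# True where the row's ItemType is not one of the excluded types.
flag = ~df['ItemType'].isin(types)

# One value per original row: True only if every row in that ItemID group is True.
mask = flag.groupby(df['ItemID']).transform('all')
print(mask.tolist())
# [False, False, False, False, True, True, False, False]

# The mask shares df's index, so it can filter the rows directly.
rows = df[mask]
print(rows)
```

Because the mask has the same length and index as df, no merge or reindexing step is needed before filtering.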
An alternative method is to calculate an array of non-white ItemID values. Then filter your dataframe:
non_whites = df.loc[df['ItemType'].ne('White'), 'ItemID'].unique()
res = df[~df['ItemID'].isin(non_whites)]
print(res)
ItemType ItemPrice ItemID
4 White 14 C
5 White 15 C
You can also use GroupBy, but it's not absolutely necessary.
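For completeness, one way the GroupBy route could look is with GroupBy.filter, which keeps whole groups for which a function returns True. This is a sketch of one possible variant, not necessarily what the answer had in mind, and since it calls a Python function once per group it may be slower than the isin approach when there are many groups.

```python
import pandas as pd

df = pd.DataFrame({'ItemType': ['Red', 'White', 'Red', 'Blue', 'White', 'White', 'White', 'Green'],
                   'ItemPrice': [10, 11, 12, 13, 14, 15, 16, 17],
                   'ItemID': ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D']})

# Keep only the groups in which every ItemType equals 'White'.
res = df.groupby('ItemID').filter(lambda g: g['ItemType'].eq('White').all())
print(res)
```

The result keeps the original row index, so here it contains rows 4 and 5 for ItemID C.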