Counting values across different Pandas DataFrames under a specific condition

Question

I have datasets similar to this:

df1

company  date        act_call  act_visit  po
A        2022-10-01  Yes       No         No
B        2022-10-01  Yes       No         Yes
C        2022-10-01  No        No         No
B        2022-10-02  No        Yes        No
A        2022-10-02  No        Yes        No

df2

company  date        act_call  act_visit  po
D        2022-11-01  Yes       No         No
B        2022-11-01  Yes       No         Yes
C        2022-11-01  Yes       Yes        No
D        2022-11-02  No        Yes        No
A        2022-11-02  No        Yes        Yes

I want to count the number of companies whose po is 'No' in df1 and which also exist in df2.

I tried using this code:

int_df = len(set(df2['company']).intersection(df1['po'].eq('no').groupby(df1['company'])))

but it returns the error below:

unhashable type: 'Series'

My expected output:

2, (A, C)

*Note: (A, C) doesn't have to be printed, since I actually only want the number of companies.

What would be the best code to get my expected output? Thank you in advance!

Answer 1

I would first filter the companies based on df2 with isin, then aggregate with groupby.all to identify the companies that have only "No", and sum:

(df1.loc[df1['company'].isin(df2['company']), 'po']
    .eq('No')
    .groupby(df1['company']).all()
    .sum()
)

Output: 2
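For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrames are rebuilt by hand from the tables in the question, and the intermediate names (in_both, only_no, count, names) are mine, not from the original post. Note that eq is case-sensitive, so it compares against 'No' rather than the 'no' used in the attempted code.

```python
import pandas as pd

# Sample data typed in from the question's tables.
df1 = pd.DataFrame({
    "company": ["A", "B", "C", "B", "A"],
    "date": ["2022-10-01", "2022-10-01", "2022-10-01", "2022-10-02", "2022-10-02"],
    "act_call": ["Yes", "Yes", "No", "No", "No"],
    "act_visit": ["No", "No", "No", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "No"],
})
df2 = pd.DataFrame({
    "company": ["D", "B", "C", "D", "A"],
    "date": ["2022-11-01", "2022-11-01", "2022-11-01", "2022-11-02", "2022-11-02"],
    "act_call": ["Yes", "Yes", "Yes", "No", "No"],
    "act_visit": ["No", "No", "Yes", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "Yes"],
})

# Keep only the df1 rows whose company also appears in df2.
in_both = df1[df1["company"].isin(df2["company"])]

# Per company: True only if every one of its df1 rows has po == "No".
only_no = in_both["po"].eq("No").groupby(in_both["company"]).all()

count = int(only_no.sum())            # number of qualifying companies
names = sorted(only_no[only_no].index)  # their names, if you want them

print(count, names)  # 2 ['A', 'C']
```

B is excluded because one of its df1 rows has po == "Yes". If a company should count as soon as any of its rows is "No", swapping .all() for .any() would be the change to make.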

Solution 1
0 2023-01-06 05:25:16
