Counting Values from Different Pandas Dataframes with Certain Condition
I have datasets similar to this:
df1
company | date | act_call | act_visit | po |
---|---|---|---|---|
A | 2022-10-01 | Yes | No | No |
B | 2022-10-01 | Yes | No | Yes |
C | 2022-10-01 | No | No | No |
B | 2022-10-02 | No | Yes | No |
A | 2022-10-02 | No | Yes | No |
df2
company | date | act_call | act_visit | po |
---|---|---|---|---|
D | 2022-11-01 | Yes | No | No |
B | 2022-11-01 | Yes | No | Yes |
C | 2022-11-01 | Yes | Yes | No |
D | 2022-11-02 | No | Yes | No |
A | 2022-11-02 | No | Yes | Yes |
I want to count the number of companies whose po is 'No' in df1 and that also exist in df2.
I tried using this code:
int_df = len(set(df2['company']).intersection(df1['po'].eq('no').groupby(df1['company'])))
but it returns the following error:
unhashable type: 'Series'
My expected output:
2, (A, C)
*Note: the (A, C) doesn't have to be printed, since I only want the number of companies.
What would be the best code to get my expected output? Thank you in advance!
I would first filter df1 down to the companies that also appear in df2 with isin, then aggregate with groupby.all to identify the companies whose po is only "No", and finally sum:
(df1.loc[df1['company'].isin(df2['company']), 'po']
.eq('No')
.groupby(df1['company']).all()
.sum()
)
Output: 2
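For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds df1 and df2 from the tables in the question (dates kept as plain strings, which is an assumption) and applies the same isin / groupby / all / sum steps. The names in_both, only_no, has_other and set_count are illustrative, not from the original post. It also includes a set-based variant that stays closer to the attempt in the question.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question
df1 = pd.DataFrame({
    "company": ["A", "B", "C", "B", "A"],
    "date": ["2022-10-01", "2022-10-01", "2022-10-01", "2022-10-02", "2022-10-02"],
    "act_call": ["Yes", "Yes", "No", "No", "No"],
    "act_visit": ["No", "No", "No", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "No"],
})
df2 = pd.DataFrame({
    "company": ["D", "B", "C", "D", "A"],
    "date": ["2022-11-01", "2022-11-01", "2022-11-01", "2022-11-02", "2022-11-02"],
    "act_call": ["Yes", "Yes", "Yes", "No", "No"],
    "act_visit": ["No", "No", "Yes", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "Yes"],
})

# Keep only the df1 rows whose company also appears in df2
in_both = df1[df1["company"].isin(df2["company"])]

# Per company: True only if every one of its df1 rows has po == "No"
only_no = in_both["po"].eq("No").groupby(in_both["company"]).all()

count = int(only_no.sum())
companies = sorted(only_no[only_no].index)
print(count, companies)

# Set-based variant: drop every company that has a non-"No" po in df1,
# then intersect what is left with the companies in df2
has_other = set(df1.loc[df1["po"].ne("No"), "company"])
set_count = len((set(df1["company"]) - has_other) & set(df2["company"]))
print(set_count)
```

As for the error in the question: it most likely comes from passing the groupby object to set.intersection. Iterating a groupby yields (key, Series) pairs, and a Series cannot be hashed. The comparison also uses lowercase 'no', while the data contains 'No'.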