Counting Values from Different Pandas Dataframes with Certain Condition

Question

I have datasets similar to this:

df1

company	date	act_call	act_visit	po
A	2022-10-01	Yes	No	No
B	2022-10-01	Yes	No	Yes
C	2022-10-01	No	No	No
B	2022-10-02	No	Yes	No
A	2022-10-02	No	Yes	No

df2

company	date	act_call	act_visit	po
D	2022-11-01	Yes	No	No
B	2022-11-01	Yes	No	Yes
C	2022-11-01	Yes	Yes	No
D	2022-11-02	No	Yes	No
A	2022-11-02	No	Yes	Yes

I want to count the companies whose po is 'No' in df1 and which also appear in df2.

I tried using this code:

int_df = len(set(df2['company']).intersection(df1['po'].eq('no').groupby(df1['company'])))

but it returns the error below:

unhashable type: 'Series'

My expected output:

2, (A, C)

*Note: (A, C) doesn't have to be printed, since I only want the number of companies.

What would be the best code to get my expected output? Thank you in advance!

Answer 1

I would first filter df1 to the companies that also appear in df2 with isin, then aggregate with groupby.all to identify the companies whose po is "No" in every row, and finally sum the resulting booleans:

(df1.loc[df1['company'].isin(df2['company']), 'po']
    .eq('No')
    .groupby(df1['company']).all()
    .sum()
)

Output: 2
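
For anyone who wants to reproduce this, here is a minimal, self-contained sketch that rebuilds the two sample frames from the question and also recovers the company names. The variable names all_no, matches, count and answer_count are only illustrative and are not from the original post. The set-based variant stays close to the asker's attempt, which fails because iterating over a groupby object yields (key, Series) pairs that cannot be hashed into a set. Note also that the comparison is case-sensitive, so it has to be 'No' rather than 'no'.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "company": ["A", "B", "C", "B", "A"],
    "date": ["2022-10-01", "2022-10-01", "2022-10-01", "2022-10-02", "2022-10-02"],
    "act_call": ["Yes", "Yes", "No", "No", "No"],
    "act_visit": ["No", "No", "No", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "No"],
})
df2 = pd.DataFrame({
    "company": ["D", "B", "C", "D", "A"],
    "date": ["2022-11-01", "2022-11-01", "2022-11-01", "2022-11-02", "2022-11-02"],
    "act_call": ["Yes", "Yes", "Yes", "No", "No"],
    "act_visit": ["No", "No", "Yes", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "Yes"],
})

# One boolean per df1 company: True only if every row has po == "No"
all_no = df1["po"].eq("No").groupby(df1["company"]).all()

# Companies that pass the check, intersected with the companies present in df2
matches = set(all_no[all_no].index) & set(df2["company"])
count = len(matches)

# The chained version from the answer, for comparison
answer_count = (
    df1.loc[df1["company"].isin(df2["company"]), "po"]
    .eq("No")
    .groupby(df1["company"]).all()
    .sum()
)

print(count, sorted(matches))
print(answer_count)
```

Both approaches assume "po is 'No'" means that every df1 row for that company is 'No', which is what the expected output (A, C) implies, since B has one 'Yes' row. If a single 'No' row should be enough, replacing .all() with .any() would be the natural change.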
