
Counting Values from Different Pandas Dataframes with Certain Condition

I have datasets similar to this:

df1

company  date        act_call  act_visit  po
A        2022-10-01  Yes       No         No
B        2022-10-01  Yes       No         Yes
C        2022-10-01  No        No         No
B        2022-10-02  No        Yes        No
A        2022-10-02  No        Yes        No

df2

company  date        act_call  act_visit  po
D        2022-11-01  Yes       No         No
B        2022-11-01  Yes       No         Yes
C        2022-11-01  Yes       Yes        No
D        2022-11-02  No        Yes        No
A        2022-11-02  No        Yes        Yes

I want to count the number of companies whose po is 'No' in df1 and that also exist in df2.

I tried using this code:

int_df = len(set(df2['company']).intersection(df1['po'].eq('no').groupby(df1['company'])))

but it raises the following error:

unhashable type: 'Series'

My expected output:

2, (A, C)

*Note: (A, C) doesn't have to be printed, since I only want the number of companies.

What would be the best code to get my expected output? Thank you in advance!

I would first filter df1 to the companies that also appear in df2 with isin, then aggregate with groupby.all to identify the companies that have only "No", and sum the result:

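# keep only the 'po' values of df1 rows whose company also appears in df2
(df1.loc[df1['company'].isin(df2['company']), 'po']
    .eq('No')                         # True where po is exactly 'No'
    .groupby(df1['company']).all()    # True if every row of that company is 'No'
    .sum()                            # count the companies flagged True
)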

Output: 2
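
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the two sample frames from the question and runs the same isin / groupby.all / sum chain. The variable names in_df2, only_no, count and companies are my own, introduced only for illustration, and I group by the filtered company column rather than df1['company'], which should give the same result here. It also pulls out the matching company names in case they are useful.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "company": ["A", "B", "C", "B", "A"],
    "date": ["2022-10-01", "2022-10-01", "2022-10-01", "2022-10-02", "2022-10-02"],
    "act_call": ["Yes", "Yes", "No", "No", "No"],
    "act_visit": ["No", "No", "No", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "No"],
})
df2 = pd.DataFrame({
    "company": ["D", "B", "C", "D", "A"],
    "date": ["2022-11-01", "2022-11-01", "2022-11-01", "2022-11-02", "2022-11-02"],
    "act_call": ["Yes", "Yes", "Yes", "No", "No"],
    "act_visit": ["No", "No", "Yes", "Yes", "Yes"],
    "po": ["No", "Yes", "No", "No", "Yes"],
})

# Rows of df1 whose company also appears in df2 (hypothetical helper name)
in_df2 = df1["company"].isin(df2["company"])

# Per company: True only if every remaining row has po == 'No'
only_no = (
    df1.loc[in_df2, "po"]
    .eq("No")
    .groupby(df1.loc[in_df2, "company"])
    .all()
)

count = int(only_no.sum())                   # number of qualifying companies
companies = sorted(only_no[only_no].index)   # their names, if needed

print(count, companies)  # 2 ['A', 'C']
```

B is excluded because one of its df1 rows has po equal to 'Yes', so all() is False for that group. Note also that the comparison is case-sensitive, so eq('no') as in the question would match nothing.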
