Finding intersection between two Dataframes and taking mean - Python

I have two DFs.

df1

userID  time_taken  Score
1          65         5
2          25         6
3          78         4
4          45         7
5          98         8
6          65         9
7          24         2
8          21         5
9          35         6
10         79         3

df2

userID  time_taken  Score
1           78        7
4           54        8
7           23        5
10          96        4

I want to find the intersection of the two DataFrames based on userID and take the mean of the remaining columns for each matching user.

My output should be:

userID  time_taken  Score
1          71.5       6
4          49.5      7.5
7          23.5      3.5
10         87.5      3.5

Can anybody help me in doing this?

Thanks

print(pd.concat([df1[df1['userID'].isin(df2['userID'])], df2]).groupby('userID').mean())

        time_taken  Score
userID                   
1             71.5    6.0
4             49.5    7.5
7             23.5    3.5
10            87.5    3.5

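The first list element, df1[df1['userID'].isin(df2['userID'])], restricts df1 to the userIDs that also appear in df2. You can pass [df1, df2] instead if you don't need that restriction; the result then contains every userID, and users present only in df1 keep their own values as the mean.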
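For anyone who wants to run this, here is a minimal self-contained sketch of the approach above. The frames are rebuilt from the tables in the question, and the variable names common, filtered_mean and all_mean are only illustrative. It assumes a pandas version where print is a function call (Python 3).

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    "userID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "time_taken": [65, 25, 78, 45, 98, 65, 24, 21, 35, 79],
    "Score": [5, 6, 4, 7, 8, 9, 2, 5, 6, 3],
})
df2 = pd.DataFrame({
    "userID": [1, 4, 7, 10],
    "time_taken": [78, 54, 23, 96],
    "Score": [7, 8, 5, 4],
})

# Keep only the df1 rows whose userID also occurs in df2.
common = df1[df1["userID"].isin(df2["userID"])]

# Stack the matching rows with df2 and average per userID.
filtered_mean = pd.concat([common, df2]).groupby("userID").mean()

# Without the isin filter, every userID from df1 is kept as well.
all_mean = pd.concat([df1, df2]).groupby("userID").mean()

print(filtered_mean)
print(len(all_mean))
```

With the isin filter the result has the four shared userIDs; without it there are ten rows, and the six users found only in df1 are "averaged" over a single row.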

First, concatenate the two DataFrames into one and group by userID to take the mean. Then keep only the userIDs that are common to both.

>>> index_common = sorted(set(df1['userID']).intersection(df2['userID']))
>>> print(pd.concat([df2, df1]).groupby(['userID']).mean().loc[index_common])

        time_taken  Score
userID                   
1             71.5    6.0
4             49.5    7.5
7             23.5    3.5
10            87.5    3.5
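A different route, not taken from either answer, is an inner join with merge followed by a row-wise mean of each column pair. This is only a sketch under two assumptions: each userID appears at most once per frame, and the suffixes "_1" and "_2" and the names merged, value_cols and result are made up for illustration.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    "userID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "time_taken": [65, 25, 78, 45, 98, 65, 24, 21, 35, 79],
    "Score": [5, 6, 4, 7, 8, 9, 2, 5, 6, 3],
})
df2 = pd.DataFrame({
    "userID": [1, 4, 7, 10],
    "time_taken": [78, 54, 23, 96],
    "Score": [7, 8, 5, 4],
})

# Inner join keeps only userIDs present in both frames;
# overlapping column names get the (illustrative) suffixes.
merged = df1.merge(df2, on="userID", how="inner", suffixes=("_1", "_2"))

# Average each pair of suffixed columns row by row.
value_cols = ["time_taken", "Score"]
result = pd.DataFrame({"userID": merged["userID"]})
for col in value_cols:
    result[col] = merged[[f"{col}_1", f"{col}_2"]].mean(axis=1)

print(result)
```

This keeps userID as an ordinary column, which matches the layout asked for in the question. If a userID can repeat within a frame, the merge produces one row per pairing, so the concat and groupby approach is probably the safer choice in that case.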

