I have two DataFrames.
df1
userID time_taken Score
1 65 5
2 25 6
3 78 4
4 45 7
5 98 8
6 65 9
7 24 2
8 21 5
9 35 6
10 79 3
df2
userID time_taken Score
1 78 7
4 54 8
7 23 5
10 96 4
I want to find the intersection of the two DataFrames based on userID and take the mean of the remaining columns for each matching userID.
My expected output is:
userID time_taken Score
1 71.5 6
4 49.5 7.5
7 23.5 3.5
10 87.5 3.5
Can anybody help me do this?
Thanks
print(pd.concat([df1[df1['userID'].isin(df2['userID'])], df2]).groupby('userID').mean())
time_taken Score
userID
1 71.5 6.0
4 49.5 7.5
7 23.5 3.5
10 87.5 3.5
[df1[df1['userID'].isin(df2['userID'])], df2] could be just [df1, df2] if you don't mind userIDs outside the intersection showing up in the result.
Another option is an inner join.
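To make the inner join suggestion concrete, here is a minimal sketch using the data from the question. It assumes each userID appears at most once in each DataFrame; the suffixes `_1` and `_2` and the name `result_merge` are arbitrary choices for illustration, not something from the original comment.

```python
import pandas as pd

df1 = pd.DataFrame({
    "userID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "time_taken": [65, 25, 78, 45, 98, 65, 24, 21, 35, 79],
    "Score": [5, 6, 4, 7, 8, 9, 2, 5, 6, 3],
})
df2 = pd.DataFrame({
    "userID": [1, 4, 7, 10],
    "time_taken": [78, 54, 23, 96],
    "Score": [7, 8, 5, 4],
})

# The inner join keeps only userIDs present in both frames.
# Columns sharing a name get the suffixes below (chosen arbitrarily).
merged = df1.merge(df2, on="userID", how="inner", suffixes=("_1", "_2"))

# Average each pair of suffixed columns row by row.
# Assumption: one row per userID in each frame.
result_merge = pd.DataFrame({
    "userID": merged["userID"],
    "time_taken": (merged["time_taken_1"] + merged["time_taken_2"]) / 2,
    "Score": (merged["Score_1"] + merged["Score_2"]) / 2,
})
print(result_merge)
```

Unlike the concat/groupby version, this keeps userID as a regular column rather than the index.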
First, concatenate the two DataFrames into a single one and group by userID to take the mean. Then keep only the userIDs that are common to both.
>>> index_common = set(df1['userID']).intersection(df2['userID'])
>>> # .ix was removed from pandas; use .loc with a sorted list of the common IDs
>>> print(pd.concat([df2, df1]).groupby(['userID']).mean().loc[sorted(index_common)])
time_taken Score
userID
1 71.5 6.0
4 49.5 7.5
7 23.5 3.5
10 87.5 3.5
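For anyone who wants to run this end to end, the same idea can be wrapped in a small helper. This is only a sketch: the function name `mean_over_common_ids` and its `key` parameter are made up for illustration, and the frames are rebuilt from the data in the question.

```python
import pandas as pd


def mean_over_common_ids(left, right, key="userID"):
    """Average the non-key columns over keys that occur in both frames."""
    # Keys present in both frames, as a sorted list usable with .loc.
    common = sorted(set(left[key]) & set(right[key]))
    # Stack the frames, average per key, then keep only the common keys.
    return pd.concat([left, right]).groupby(key).mean().loc[common]


df1 = pd.DataFrame({
    "userID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "time_taken": [65, 25, 78, 45, 98, 65, 24, 21, 35, 79],
    "Score": [5, 6, 4, 7, 8, 9, 2, 5, 6, 3],
})
df2 = pd.DataFrame({
    "userID": [1, 4, 7, 10],
    "time_taken": [78, 54, 23, 96],
    "Score": [7, 8, 5, 4],
})

means = mean_over_common_ids(df1, df2)
print(means)
```

Because the common keys are computed from both frames, a userID that appears only in df2 is dropped as well, which the isin-based filter on df1 alone would not do.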