How do I group two dataframes based on a couple of values in Python?

I have two similar dataframes, U and U1:

U
    ID1      ID2  Time A    Friends Distance
0   John    Tom      2         1    4
1   Alex    John     2         0    2
2   Alex    Paul     5         1    3
3   Frank   Richard  1         0    5

U1 
    ID1      ID2    Time B  Friends Distance
0   John    Richard  2         1    0
1   Alex    Frank    2         0    1
2   Alex    Paul     3         1    3
3   Frank   Richard  2         0    5

I would like a dataframe that combines the rows of both by ID1 and ID2 and keeps both Time A and Time B:

U2
    ID1     ID2      Time A Time B  Friends Distance
0   John    Tom      2         0       1    4
1   Alex    John     2         0       0    2
2   Alex    Paul     5         3       1    5
3   Frank   Richard  1         2       0    5
4   John    Richard  0         2       1    3
5   Alex    Frank    0         2       0    1

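If I understand correctly, you can use merge and combine_first. The merge does an outer join on ID1 and ID2, and combine_first fills the Friends and Distance values that are still missing from U. Finally, drop the columns with the _new suffix and fill NaN with 0.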

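import pandas as pd
# Outer join on the ID pair; U's overlapping columns get the _new suffix
U2 = pd.merge(U,U1, on=['ID1', 'ID2'], how='outer', suffixes=('_new', ''))
# combine_first aligns on the row index, so this assumes the merged rows keep U's order
U2 = U2.combine_first(U)
U2 = U2.drop(['Friends_new','Distance_new'], axis=1).fillna(0)
U2 = U2[['ID1', 'ID2', 'Time A', 'Time B', 'Friends', 'Distance']]
print(U2)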
    ID1      ID2  Time A  Time B  Friends  Distance
0   John      Tom       2       0        1         4
1   Alex     John       2       0        0         2
2   Alex     Paul       5       3        1         3
3  Frank  Richard       1       2        0         5
4   John  Richard       0       2        1         0
5   Alex    Frank       0       2        0         1
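
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The two frames are rebuilt from the tables in the question. One substitution on my part, not part of the answer above: instead of combine_first, which aligns on the row index, it fills Friends and Distance column by column from the _new copies, so it should not depend on the row order that the outer merge returns (that order may differ between pandas versions). The names keys and num_cols are only illustrative.

```python
import pandas as pd

# Frames rebuilt from the tables in the question
U = pd.DataFrame({
    "ID1": ["John", "Alex", "Alex", "Frank"],
    "ID2": ["Tom", "John", "Paul", "Richard"],
    "Time A": [2, 2, 5, 1],
    "Friends": [1, 0, 1, 0],
    "Distance": [4, 2, 3, 5],
})
U1 = pd.DataFrame({
    "ID1": ["John", "Alex", "Alex", "Frank"],
    "ID2": ["Richard", "Frank", "Paul", "Richard"],
    "Time B": [2, 2, 3, 2],
    "Friends": [1, 0, 1, 0],
    "Distance": [0, 1, 3, 5],
})

keys = ["ID1", "ID2"]                                  # illustrative name
num_cols = ["Time A", "Time B", "Friends", "Distance"]  # illustrative name

# Outer join on the ID pair; U's overlapping columns get the _new suffix
U2 = pd.merge(U, U1, on=keys, how="outer", suffixes=("_new", ""))

# Prefer U1's Friends/Distance, fall back to U's copy where U1 had no row
for col in ["Friends", "Distance"]:
    U2[col] = U2[col].fillna(U2[col + "_new"])
U2 = U2.drop(columns=["Friends_new", "Distance_new"])

# Pairs seen in only one frame have no time in the other: use 0, then restore ints
U2[num_cols] = U2[num_cols].fillna(0).astype(int)
U2 = U2[keys + num_cols]
print(U2)
```

When a pair appears in both frames, this keeps U1's Friends and Distance, which matches the suffix choice in the answer. In the sample data the two frames agree on those pairs, so the choice makes no difference here.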
