
Python: how to merge two data frames based on a different number of columns?

I have two data frames: the first one, df1, contains information about places, while the second one, df2, counts the interactions between pairs of places.

df1: 
     ID   x   y
0    0    5   2
1    1    2   3
2    2    3   6
3    3    0   1
4    4    9   8

df2: 
    ID1  ID2  t
0    1    4   20
1    1    2   33
2    2    3   64
3    3    4   13
4    1    3   80
5    11   2   34

I would like to merge the two data frames, looking up both ID1 and ID2 in df1, to get something like this:

df3: 
    ID1  ID2  t    x1  y1  x2  y2
0    1    4   20   2   3   9   8
1    1    2   33   2   3   3   6
2    2    3   64   3   6   0   1
3    3    4   13   0   1   9   8
4    1    3   80   2   3   0   1
5    11   2   34  NaN NaN  3   6

The NaN values appear because place ID 11 is not in df1.

Try this:

In [36]: df2.merge(df1, left_on='ID1', right_on='ID', how='left') \
            .merge(df1, left_on='ID2', right_on='ID', how='left', suffixes=('', '_2')) \
            .drop(columns=['ID', 'ID_2'])
Out[36]:
   ID1  ID2   t    x    y  x_2  y_2
0    1    4  20  2.0  3.0    9    8
1    1    2  33  2.0  3.0    3    6
2    2    3  64  3.0  6.0    0    1
3    3    4  13  0.0  1.0    9    8
4    1    3  80  2.0  3.0    0    1
5   11    2  34  NaN  NaN    3    6
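
If you want the column names to match the desired df3 exactly (x1, y1, x2, y2), one option is to rename the columns of df1 before each merge, so that no suffixes and no drop are needed. The following is a minimal, self-contained sketch of that idea, not the only way to do it; the helper name `place_coords` is made up for illustration, and the data frames are rebuilt from the tables in the question.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({"ID": [0, 1, 2, 3, 4],
                    "x": [5, 2, 3, 0, 9],
                    "y": [2, 3, 6, 1, 8]})
df2 = pd.DataFrame({"ID1": [1, 1, 2, 3, 1, 11],
                    "ID2": [4, 2, 3, 4, 3, 2],
                    "t": [20, 33, 64, 13, 80, 34]})


def place_coords(places, id_col, suffix):
    """Hypothetical helper: copy of `places` with ID renamed to `id_col`
    and x/y renamed to x<suffix>/y<suffix>."""
    return places.rename(
        columns={"ID": id_col, "x": f"x{suffix}", "y": f"y{suffix}"}
    )


# Two left merges, one per ID column. how="left" keeps every row of df2,
# so IDs missing from df1 (here ID1 == 11) end up with NaN coordinates.
df3 = (
    df2.merge(place_coords(df1, "ID1", "1"), on="ID1", how="left")
       .merge(place_coords(df1, "ID2", "2"), on="ID2", how="left")
)

print(df3)
```

Because the key column has the same name on both sides of each merge, there is no extra ID column to drop afterwards. As in the output above, x1 and y1 become floats because they contain NaN.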
