
Combining two dataframes in Pandas on multiple columns when one of the target columns does not have matching values?

I have two dataframes like this:

df1 =  time, lat,  lon, lev, val1 
       1     10    10   1    10

df2 =  time, lat, lon, lev, val2
       1     10    10  2     20

The first four columns are basically coordinates. I would like to combine/merge the dataframes so that the output is this:

df_total =  time, lat,  lon, lev, val1, val2
            1     10    10   1    10    nan
            1     10    10   2    nan   20

I am having trouble because the two dataframes have no matching values in the 'lev' column, although both have values in 'lev'. When I join on all four columns, the output dataframe is, of course, empty. When I join on only time, lat, and lon, I don't get the behaviour I expect: I get a lev_x and a lev_y column, and val1 and val2 end up in the same row. So, how can this be done?

Use this code:

a = pd.concat([df1, df2], ignore_index=True)
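
As a minimal sketch of what that call does, the frames below are rebuilt from the values in the question (the constructor calls are an assumption about how the data is held). Columns missing from one frame are filled with NaN, and ignore_index=True renumbers the rows:

```python
import pandas as pd

# Frames rebuilt from the question's example values (illustrative only).
df1 = pd.DataFrame({'time': [1], 'lat': [10], 'lon': [10], 'lev': [1], 'val1': [10]})
df2 = pd.DataFrame({'time': [1], 'lat': [10], 'lon': [10], 'lev': [2], 'val2': [20]})

# Stack the rows; columns are aligned by name and gaps become NaN.
a = pd.concat([df1, df2], ignore_index=True)
print(a)
#    time  lat  lon  lev  val1  val2
# 0     1   10   10    1  10.0   NaN
# 1     1   10   10    2   NaN  20.0
```

val1 and val2 become floats because NaN cannot be stored in a plain integer column.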

Simply do the following:

import pandas as pd

df1 = pd.DataFrame({'time': [1], 'lat':10, 'lon':10, 'lev':1, 'val1':10})

df2 = pd.DataFrame({'time': [1], 'lat':10, 'lon':10, 'lev':2, 'val2':20})

# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
df = pd.concat([df1, df2])

Result

   time  lat  lon  lev  val1  val2
0     1   10   10    1  10.0   NaN
0     1   10   10    2   NaN  20.0

If you absolutely want all non-null elements to be integers, consider using this instead:

# 'Int64' (capital I) is the nullable integer dtype, so missing values show as <NA>
df = pd.concat([df1, df2]).astype('Int64')

#    time  lat  lon  lev  val1  val2
# 0     1   10   10    1    10  <NA>
# 0     1   10   10    2  <NA>    20
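
If the goal is a true join on the coordinates rather than simple stacking, an outer merge on all four columns may also be worth considering. The sketch below is not taken from the answers above. It assumes the same example frames as the question, and the sort and index reset are added only to make the row order explicit:

```python
import pandas as pd

# Frames rebuilt from the question's example values (illustrative only).
df1 = pd.DataFrame({'time': [1], 'lat': [10], 'lon': [10], 'lev': [1], 'val1': [10]})
df2 = pd.DataFrame({'time': [1], 'lat': [10], 'lon': [10], 'lev': [2], 'val2': [20]})

coords = ['time', 'lat', 'lon', 'lev']

# how='outer' keeps rows whose coordinates appear in either frame,
# so the result is not empty even though no 'lev' values match.
merged = (
    pd.merge(df1, df2, on=coords, how='outer')
    .sort_values(coords)
    .reset_index(drop=True)
)
print(merged)
```

For this data the result has the same two rows as the concatenation. The two approaches differ when some rows share all four coordinates: the outer merge then puts val1 and val2 on one row, whereas concatenation keeps them as separate rows.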


 