
Merging Data Frames on Unique Values

I have two data frames. One is a generic "template" with a column of dates at hourly intervals from now until 4 days from now. The other has data in it, such as Latitude and Longitude; it also has a date column, but its data is only every 3 hours. I need to combine the two so that each lat/lon pair in DF2 gets every hourly row from DF1.

DF1
Date                 Shift
2021-10-18 01:00:00  a1
2021-10-18 02:00:00  a2
.....
2021-10-18 21:00:00  b2

DF2
Latitude  Longitude  Date                 Temp
39.9      -99.3      2021-10-18 18:00:00  34
39.9      -99.3      2021-10-18 21:00:00  36
.....
39.9      -99.3      2021-10-19 00:00:00  32

Expected Final Data Frame

Latitude Longitude Date                 Shift           Temp
39.9     -99.3     2021-10-18 01:00:00  a1              NaN
39.9     -99.3     2021-10-18 02:00:00  a1              NaN
.....
39.9     -99.3     2021-10-18 17:00:00  b2              NaN
39.9     -99.3     2021-10-18 18:00:00  b2              34
39.9     -99.3     2021-10-18 19:00:00  b2              NaN

In DF2 there are 3,088 unique Lat/Lon pairs, and each unique pair needs a date column covering 4 days, hour by hour. My final DF should have 299,536 rows.

Use DataFrame.merge with the how and on parameters. From the pandas docs:

import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df2 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})
df1.merge(df2, how='inner', on='a')

will give you:

      a  b  c
0   foo  1  3

while using:

df1.merge(df2, how='left', on='a')

will give you:

      a  b  c
0   foo  1  3.0
1   bar  2  NaN
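
Applied to the question, a left merge alone is probably not enough, because DF1 has no Latitude/Longitude columns to merge on. One possible approach is to cross join the unique lat/lon pairs with the hourly template first, then left-merge the 3-hourly data onto the result. The sketch below is only an illustration under stated assumptions: the two small frames are made-up stand-ins for DF1 and DF2, the Date columns are assumed to already be datetimes with matching values, and how="cross" assumes pandas 1.2 or newer.

```python
import pandas as pd

# Hypothetical miniature stand-in for DF1 (hourly template).
df1 = pd.DataFrame({
    "Date": pd.to_datetime([
        "2021-10-18 17:00:00",
        "2021-10-18 18:00:00",
        "2021-10-18 19:00:00",
    ]),
    "Shift": ["b2", "b2", "b2"],
})

# Hypothetical miniature stand-in for DF2 (3-hourly data, two lat/lon pairs).
df2 = pd.DataFrame({
    "Latitude": [39.9, 39.9, 40.1],
    "Longitude": [-99.3, -99.3, -98.7],
    "Date": pd.to_datetime([
        "2021-10-18 18:00:00",
        "2021-10-18 21:00:00",  # outside the mini template, so it is dropped
        "2021-10-18 18:00:00",
    ]),
    "Temp": [34, 36, 40],
})

# 1. One row per unique lat/lon pair.
pairs = df2[["Latitude", "Longitude"]].drop_duplicates()

# 2. Cross join: every pair gets every hourly row of the template.
template = pairs.merge(df1, how="cross")

# 3. Left merge: keep all hourly rows, fill Temp only where DF2 has a reading.
result = template.merge(df2, how="left", on=["Latitude", "Longitude", "Date"])

print(result)
```

The row count of the result is the number of unique pairs times the number of template rows. With 3,088 pairs, the 299,536 rows expected in the question correspond to 97 hourly timestamps per pair (3,088 × 97 = 299,536).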
