I have two data frames. One is a generic "template" with a column of dates at hourly intervals from now until 4 days from now. The other DF has data in it, such as Latitude and Longitude; it also has a date column, but its data is only every 3 hours. I need to combine both data frames so that each lat/lon pair in DF2 has a row for every hour in DF1.
DF1
Date                 Shift
2021-10-18 01:00:00  a1
2021-10-18 02:00:00  a2
.....
2021-10-18 21:00:00  b2

DF2
Latitude  Longitude  Date                 Temp
39.9      -99.3      2021-10-18 18:00:00  34
39.9      -99.3      2021-10-18 21:00:00  36
.............
39.9      -99.3      2021-10-19 00:00:00  32
Expected Final Data Frame
Latitude  Longitude  Date                 Shift  Temp
39.9      -99.3      2021-10-18 01:00:00  a1     NaN
39.9      -99.3      2021-10-18 02:00:00  a1     NaN
.....
39.9      -99.3      2021-10-18 17:00:00  b2     NaN
39.9      -99.3      2021-10-18 18:00:00  b2     34
39.9      -99.3      2021-10-18 19:00:00  b2     NaN
In DF2 there are 3,088 unique Lat/Lon pairs, and each unique pair needs a date column covering 4 days, hour by hour. My final DF should have 299,536 rows.
Use merge with the how and on options. From the pandas docs:
import pandas as pd

df1 = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 2]})
df2 = pd.DataFrame({'a': ['foo', 'baz'], 'c': [3, 4]})
df1.merge(df2, how='inner', on='a')
will give you:
     a  b  c
0  foo  1  3
while using:
df1.merge(df2, how='left', on='a')
will give you:
     a  b    c
0  foo  1  3.0
1  bar  2  NaN
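Applied to the frames in the question, one possible approach is to first pair every unique Lat/Lon with every hourly row of DF1 (a cross join), and then left-merge DF2 onto that grid so hours without a reading get NaN. The following is only a minimal sketch: the small df1 and df2 below are hypothetical stand-ins with made-up Shift labels and a second made-up Lat/Lon pair, and how="cross" assumes pandas 1.2 or newer.

```python
import pandas as pd

# Hypothetical stand-in for DF1: 24 hourly timestamps with made-up Shift labels.
df1 = pd.DataFrame({
    "Date": pd.date_range("2021-10-18 01:00:00", periods=24,
                          freq=pd.Timedelta(hours=1)),
    "Shift": ["a1"] * 12 + ["b2"] * 12,
})

# Hypothetical stand-in for DF2: two Lat/Lon pairs with sparse readings.
df2 = pd.DataFrame({
    "Latitude": [39.9, 39.9, 40.5, 40.5],
    "Longitude": [-99.3, -99.3, -98.1, -98.1],
    "Date": pd.to_datetime([
        "2021-10-18 18:00:00", "2021-10-18 21:00:00",
        "2021-10-18 18:00:00", "2021-10-18 21:00:00",
    ]),
    "Temp": [34, 36, 30, 31],
})

# Step 1: one row per unique Lat/Lon pair.
pairs = df2[["Latitude", "Longitude"]].drop_duplicates()

# Step 2: cross join so every pair gets every hourly row of df1.
grid = pairs.merge(df1, how="cross")

# Step 3: left merge keeps every grid row; Temp is NaN where df2 has no reading.
result = grid.merge(df2, how="left", on=["Latitude", "Longitude", "Date"])

print(result.head())
```

The row count of the result is the number of unique pairs times the number of rows in DF1. With the numbers in the question, 299,536 rows for 3,088 pairs works out to 97 hourly timestamps per pair, so checking len(result) against that figure is a quick sanity test.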