
How to join two DataFrames on a datetime index

Consider these toy DataFrames.

Dataframe 1:

import pandas as pd

d = {'DateTime': ['2007-01-01 00:00:00', '2007-01-01 10:00:00', '2007-01-01 16:00:00',
                  '2012-01-03 10:00:00', '2012-01-03 12:00:00', '2015-01-02 00:00:00',
                  '2017-01-03 23:00:00'],
     'x': [1, 2, 3, 4, 5, 6, 7]}
df1 = pd.DataFrame(d)
df1.set_index(['DateTime'], inplace=True)
df1.index = pd.to_datetime(df1.index)

Dataframe 2:

d = {'dat': ['2007-01-01', '2015-01-02'], 'y': [1, 1]}
df2 = pd.DataFrame(d)
df2.set_index(['dat'], inplace=True)
df2.index = pd.to_datetime(df2.index)

Desired output:

DateTime                     X       Y
2007-01-01 00:00:00          1       1         
2007-01-01 10:00:00          2       1
2007-01-01 16:00:00          3       1
2012-01-03 10:00:00          4       0
2012-01-03 12:00:00          5       0
2015-01-02 00:00:00          6       1
2017-01-03 23:00:00          7       0

I tried the following, but it dropped the rows that should have received 0, and the value 1 was not repeated for every hour of the same day:

result = pd.merge(df1, df2, how="inner", on=["DateTime"])

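Here is a solution you can try. It assumes DateTime and dat are still ordinary columns (that is, set_index has not been called yet) and looks up each timestamp's calendar day in df2:

df1['DateTime'] = pd.to_datetime(df1['DateTime'])
df2['dat'] = pd.to_datetime(df2['dat'])

# dt.normalize() truncates each timestamp to midnight but keeps it a Timestamp,
# so it matches the keys from df2; dt.date would yield datetime.date objects,
# which may not match Timestamp keys. Days missing from df2 are filled with 0.
df1['y'] = (
    df1['DateTime'].dt.normalize().map(df2.set_index('dat')['y']).fillna(0)
)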

             DateTime  x    y
0 2007-01-01 00:00:00  1  1.0
1 2007-01-01 10:00:00  2  1.0
2 2007-01-01 16:00:00  3  1.0
3 2012-01-03 10:00:00  4  0.0
4 2012-01-03 12:00:00  5  0.0
5 2015-01-02 00:00:00  6  1.0
6 2017-01-03 23:00:00  7  0.0
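
If the datetimes are kept as the index, as in the question, the same day-level lookup can be done directly on the index. The following is only a sketch under that assumption; the variable name days is introduced here for illustration, and passing fill_value=0 to reindex is one way to keep y as integers instead of floats.

```python
import pandas as pd

# Same toy data as in the question, with the datetimes kept as the index.
df1 = pd.DataFrame(
    {'x': [1, 2, 3, 4, 5, 6, 7]},
    index=pd.to_datetime([
        '2007-01-01 00:00:00', '2007-01-01 10:00:00', '2007-01-01 16:00:00',
        '2012-01-03 10:00:00', '2012-01-03 12:00:00', '2015-01-02 00:00:00',
        '2017-01-03 23:00:00']),
)
df1.index.name = 'DateTime'

df2 = pd.DataFrame({'y': [1, 1]},
                   index=pd.to_datetime(['2007-01-01', '2015-01-02']))
df2.index.name = 'dat'

# Truncate every timestamp to midnight so it can be compared with df2's days.
days = df1.index.normalize()

# Look each day up in df2; days that df2 does not list get 0.
# to_numpy() assigns the values by position, so df1 keeps its own index.
df1['y'] = df2['y'].reindex(days, fill_value=0).to_numpy()

print(df1)
```

This relies on df2 having one row per day; reindex raises an error if the index of df2 contains duplicate dates.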

Below is a version that uses merge with a left join to combine the two DataFrames. It reuses your code and adds the last lines to generate the output. The join key is the calendar day rather than the full timestamp, because df2 only has entries at midnight and an exact match would miss the other hours of the same day. NaN values are then replaced with 0.

import pandas as pd

def test_join():
    d = {'DateTime': ['2007-01-01 00:00:00', '2007-01-01 10:00:00', '2007-01-01 16:00:00',
                      '2012-01-03 10:00:00', '2012-01-03 12:00:00', '2015-01-02 00:00:00',
                      '2017-01-03 23:00:00'],
         'x': [1, 2, 3, 4, 5, 6, 7]}
    df1 = pd.DataFrame(d)
    df1.set_index(['DateTime'], inplace=True)
    df1.index = pd.to_datetime(df1.index)
    print(df1)

    d = {'DateTime': ['2007-01-01', '2015-01-02'], 'y': [1, 1]}
    df2 = pd.DataFrame(d)
    df2.set_index(['DateTime'], inplace=True)
    df2.index = pd.to_datetime(df2.index)

    # Add a helper column holding each timestamp's day, left-join it against
    # df2's index, then drop the helper column again.
    df = df1.assign(day=df1.index.normalize()).merge(
        df2, left_on='day', right_index=True, how='left').drop(columns='day')
    # The left join gives NaN where df2 has no matching day; replace it with 0.
    print(df.fillna(0))

test_join()
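
With any merge-based approach it is worth checking which key is actually being compared. The small sketch below, using a subset of the question's timestamps and the hypothetical names stamps and days_with_y, suggests why matching on the full timestamp only catches rows at midnight, while matching on the normalized day catches every hour of that day.

```python
import pandas as pd

# Three timestamps from df1 and the two days listed in df2.
stamps = pd.to_datetime(['2007-01-01 00:00:00',
                         '2007-01-01 10:00:00',
                         '2012-01-03 10:00:00'])
days_with_y = pd.to_datetime(['2007-01-01', '2015-01-02'])

# Exact comparison: only the midnight timestamp equals an entry in df2.
exact_match = stamps.isin(days_with_y)

# Day-level comparison: truncate to midnight first, then compare.
day_match = stamps.normalize().isin(days_with_y)

print(exact_match.tolist())
print(day_match.tolist())
```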
