I have two datasets, one with individual reports and one with regional conditions. There are many more individual rows than regional rows, and I want to append the matching regional data onto each individual row. The problem I am facing is that I must merge on two key columns, e.g.:
Individual - 5000 rows
Code | Time | Data1 | Data2 | Data3
Regional - 100 rows
Code | Time | RData1 | RData2
--I have attempted and failed using:
df = individual.merge(regional, how='left', on=['Code', 'Time'])
--Which leaves RData1 and RData2 as null values in the new df, although it does, to its credit, have the expected shape:
df - 5000 rows
Code | Time | Data1 | Data2 | Data3 | RData1 | RData2
but the null values don't help me...
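A left merge fills the right-hand columns with nulls whenever a (Code, Time) pair from the left frame has no exact match on the right, so a reasonable first step is to check how many rows actually matched. The sketch below uses made-up toy data (the values, and the stray trailing space in one code, are assumptions for illustration and not taken from the real datasets). It shows `indicator=True` for counting matches, and one possible cleanup when the keys differ only by whitespace.

```python
import pandas as pd

# Hypothetical toy data standing in for the two tables in the question.
individual = pd.DataFrame({
    "Code": ["A1", "A1", "B2", "C3"],
    "Time": ["2015-02-24", "2015-02-25", "2015-02-24", "2015-02-24"],
    "Data1": [1.0, 2.0, 3.0, 4.0],
})
regional = pd.DataFrame({
    "Code": ["A1 ", "B2"],  # note the trailing space in "A1 "
    "Time": ["2015-02-24", "2015-02-24"],
    "RData1": [10.0, 20.0],
})

# Compare key dtypes first: keys stored as different types will not match.
print(individual[["Code", "Time"]].dtypes)
print(regional[["Code", "Time"]].dtypes)

# indicator=True adds a "_merge" column marking each row "both" or "left_only".
check = individual.merge(regional, how="left", on=["Code", "Time"], indicator=True)
print(check["_merge"].value_counts())

# One possible fix for this toy case: strip whitespace from the key, then merge again.
regional_clean = regional.assign(Code=regional["Code"].str.strip())
fixed = individual.merge(regional_clean, how="left", on=["Code", "Time"], indicator=True)
print(fixed)
```

Rows still marked `left_only` after cleaning genuinely have no regional record for that Code and Time, so their nulls are expected.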
Sample data
Generate a random DataFrame, df:
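import numpy as np
import pandas as pd

# Five timestamps one minute apart ('min' is the minute alias; 'T' is deprecated in recent pandas)
rng = pd.date_range('2015-02-24', periods=5, freq='min')
df = pd.DataFrame({'Time': rng, 'data1': np.random.randn(len(rng)), 'code': [201, 897, 345, 70, 879]})
df.set_index(['Time', 'code'], inplace=True)
df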
Generate a second random DataFrame, df1:
df1 = pd.DataFrame({'Time': rng, 'data1': np.random.randn(len(rng)), 'code': [201, 30, 345, 70, 879]})
df1.set_index(['Time', 'code'], inplace=True)
df1
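Merging on the indexes can be done as follows (the default how='inner' keeps only the index pairs present in both frames; pass how='left' to keep every row of the left frame):
result = df1.merge(df, left_index=True, right_index=True, suffixes=('_Left', '_Right'))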
result
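Or, using the top-level pd.merge function (here df is the left frame, so the '_Left' suffix applies to df's columns):
result = pd.merge(df, df1, left_index=True, right_index=True, suffixes=('_Left', '_Right'))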
result
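To tie the index-based merge back to the original question, here is a minimal sketch comparing the default inner merge with `how='left'` on the same kind of sample data. The frames are rebuilt so the block runs on its own. With these codes, exactly one (Time, code) pair differs between the two frames, so the inner merge drops that row while the left merge keeps it with a null in the right-hand column.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2015-02-24", periods=5, freq="min")
df = pd.DataFrame(
    {"Time": rng, "data1": np.random.randn(len(rng)), "code": [201, 897, 345, 70, 879]}
).set_index(["Time", "code"])
df1 = pd.DataFrame(
    {"Time": rng, "data1": np.random.randn(len(rng)), "code": [201, 30, 345, 70, 879]}
).set_index(["Time", "code"])

# Default how="inner": only index pairs found in both frames survive.
inner = pd.merge(df, df1, left_index=True, right_index=True,
                 suffixes=("_Left", "_Right"))

# how="left": every row of df is kept; unmatched rows get NaN in data1_Right.
left = pd.merge(df, df1, how="left", left_index=True, right_index=True,
                suffixes=("_Left", "_Right"))

print(len(inner), len(left))
print(left)
```

If the keys are ordinary columns rather than index levels, as in the question, the `on=['Code', 'Time']` form does the same thing without calling `set_index` first.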