简体   繁体   中英

Pandas: merge_asof-like solutions for merging two multi-indexed DataFrames?

I have two dataframes, df1 and df2 say, which are both multi-indexed.

At the first index level, both dataframes share the same keys (ie df1.index.get_level_values(0) and df2.index.get_level_values(0) contain the same elements). Those keys are unordered strings, such as ['foo','bar','baz'] .

At the second index level, both dataframes have timestamps which are ordered, but unequally spaced.

My question is as follows. I would like to merge df1 and df2 in such a way that, for each key at level 1, the values of df2 should be inserted into df1 without changing the order of df1 .

I tried using pd.merge , pd.merge_asof and pd.MultiIndex.searchsorted . From the descriptions of those methods, it seems like one of them should do the trick for me, but I cannot figure out how. Ideally, I would like to find a solution that avoids looping over the keys in index.get_level_values(0) , since my dataframes can get large.

A few failed attempts for illustration:

df_merged = pd.merge(left=df1.reset_index(), right=df2.reset_index(),
                     left_on=[['some_keys', 'timestamps_df1']], right_on=[['some_keys', 'timestamps_df2']],
                     suffixes=('', '_2')
                     ) # after sorting
# FAILED

df2.index.searchsorted(df1, side='right') # after sorting
# FAILED

Any help is greatly appreciated!

Base on your description , here is the solution from merge_asof

df_merged = pd.merge_asof(left=df1.reset_index(), right=df2.reset_index(),
                     left_on=['timestamps_df1'], right_on=['timestamps_df2'],by='some_keys',
                     suffixes=('', '_2')
                     )

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM