Python Pandas: match the closest index from another DataFrame

Question

df.index = 10,100,1000

df2.index = 1,2,11,50,101,500,1001
(Sample data only.)

I need to match each index in df to the closest index in df2, under these conditions:

df2.index must be greater than df.index
only the single closest value is matched

Example output:

df     |   df2
10     |   11
100    |   101
1000   |   1001
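
For anyone who wants to reproduce the sample, frames with these indexes could be built roughly as follows. Only the index values and the expected pairs come from the post; the column names `a` and `value` and the cell values are placeholders assumed for illustration.

```python
import pandas as pd

# Only the index values come from the post; columns and cell values are placeholders.
df = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 100, 1000])
df2 = pd.DataFrame({"value": range(7)}, index=[1, 2, 11, 50, 101, 500, 1001])

# The pairing the question asks for, taken from the example output table.
expected = pd.DataFrame({"df": [10, 100, 1000], "df2": [11, 101, 1001]})
```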

Right now I do this with a for-loop, and it is extremely slow.

I use new_df2, rather than df2, to store the matched index.

import pandas as pd

new_df2 = pd.DataFrame(columns=["value"])
for col in df.index:
    for col2 in df2.index:
        if col2 > col:
            # first df2 label greater than the current df label
            new_df2.loc[col2] = df2.loc[col2]
            break
        else:
            df2 = df2[1:]  # drop the first row to speed up later lookups

How can I avoid the for-loop in this case? Thanks.

Answer 1

Not sure how robust this is, but you can sort df2 so that its index is decreasing, and use asof to find the most recent index label matching each key in df's index:

df2.sort_index(ascending=False, inplace=True)
df['closest_df2'] = df.index.map(lambda x: df2.index.asof(x))

df
Out[19]: 
      a  closest_df2
10    1           11
100   2          101
1000  3         1001
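
One caveat with asof is that it returns the key itself when df2 contains that exact label, while the question asks for a strictly greater label. A possible alternative, not taken from the answer above, is a vectorized lookup with NumPy's `searchsorted`. The sketch below assumes integer-like, unique labels; the helper name `closest_greater` and the sample columns are hypothetical.

```python
import numpy as np
import pandas as pd


def closest_greater(keys, labels):
    """For each key, return the smallest label strictly greater than it (NaN if none)."""
    labels = np.sort(np.asarray(labels))  # searchsorted needs ascending order
    # side="right" gives the position of the first label strictly greater than the key.
    pos = np.searchsorted(labels, np.asarray(keys), side="right")
    # Clip so indexing never runs past the end, then blank out keys with no match.
    safe_pos = np.minimum(pos, len(labels) - 1)
    return pd.Series(labels[safe_pos], index=keys).where(pos < len(labels))


# Sample frames; only the index values come from the question.
df = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 100, 1000])
df2 = pd.DataFrame({"value": range(7)}, index=[1, 2, 11, 50, 101, 500, 1001])

df["closest_df2"] = closest_greater(df.index, df2.index)
```

If some key in df is larger than every label in df2, the result column becomes float with NaN for that row. `pd.merge_asof` with `direction="forward"` and `allow_exact_matches=False` may be another route to the same pairing when both frames are sorted.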

Answer 1: 4, accepted, 2015-06-03 00:24:25
