
Update numpy array based on nearest value in pandas DataFrame column

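How can I update an array based on the nearest value in a pandas DataFrame column? For example, I want to match each element of the following array against the 'Time' column of a pandas DataFrame and replace it with the 'X' value from the row whose 'Time' is closest: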

Input array:

import numpy as np

a = np.array([
    [122.25, 225.00, 201.00],
    [125.00, 151.50, 160.62],
    [99.99, 142.25, 250.01],
])

Input DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Time': [100, 125, 150, 175, 200, 225],
    'X': [26100, 26200, 26300, 26000, 25900, 25800],
})

Expected output array:

np.array([
    [26200, 25800, 25900],
    [26200, 26300, 26300],
    [26100, 26300, 25800],
])

Use merge_asof:

# Convert Time to float since your input array is float.
# merge_asof requires both sides to have the same data types
df['Time'] = df['Time'].astype('float')

# merge_asof also requires both data frames to be sorted by the join key (Time)
# So we need to flatten the input array and make note of the original order
# before going into the merge
a_ = np.ravel(a)
o_ = np.arange(len(a_))

tmp = pd.DataFrame({
    'Time': a_,
    'Order': o_
})

# Merge the two data frames and extract X in the original order
result = (
    pd.merge_asof(tmp.sort_values('Time'), df.sort_values('Time'), on='Time', direction='nearest')
        .sort_values('Order')
        ['X'].to_numpy()
        .reshape(a.shape)
)
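
If you would rather stay in numpy, the same nearest-value lookup can be sketched with np.searchsorted. This is a different technique from the merge_asof answer above, not a restatement of it. The names lookup, times, values, left, right, use_left, nearest and result_np are made up for this illustration, and the sketch assumes 'Time' has at least two distinct values and contains no NaN. Exact ties go to the smaller 'Time' here, which is simply a choice made in this sketch.

```python
import numpy as np
import pandas as pd

a = np.array([
    [122.25, 225.00, 201.00],
    [125.00, 151.50, 160.62],
    [99.99, 142.25, 250.01],
])
df = pd.DataFrame({
    'Time': [100, 125, 150, 175, 200, 225],
    'X': [26100, 26200, 26300, 26000, 25900, 25800],
})

# Sort the lookup table once, since searchsorted needs sorted keys
lookup = df.sort_values('Time')
times = lookup['Time'].to_numpy()
values = lookup['X'].to_numpy()

# For each element of a, find the first Time >= that element.
# Clip so that both a left and a right neighbour always exist.
right = np.clip(np.searchsorted(times, a), 1, len(times) - 1)
left = right - 1

# Keep whichever neighbour is closer (assumption: ties go to the smaller Time)
use_left = (a - times[left]) <= (times[right] - a)
nearest = np.where(use_left, left, right)

# Fancy indexing keeps the shape of a, so no reshape is needed
result_np = values[nearest]
print(result_np)
```

Because searchsorted accepts an array of any shape and returns indices of the same shape, there is no need to flatten a or track the original order. For the sample data this should give the same array as the expected output in the question.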

