How to replace pandas DataFrame with the values of another DataFrame?

I have two DataFrames (signal_df and price_df), which can be generated with the following code.

import pandas as pd
import numpy as np

signal_df = pd.DataFrame({
    'long': [
        True, True, np.nan, True, np.nan
    ],
    'short': [
        np.nan, np.nan, True, np.nan, True
    ],
    'date':[
        '2020-01-01 19:15', '2020-01-01 20:00', '2020-01-01 22:15', '2020-01-01 22:45', '2020-01-02 00:30',
    ],
})


# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(signal_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
signal_df = signal_df.set_index(datetime_index)
signal_df.drop('date',axis=1,inplace=True)
print(signal_df)

price_df = pd.DataFrame({
    'close_price':[
        30,  2,  3, 29,
        15,  6, 19, 56,
        9 , 38, 41, 12,
        23, 14, 15, 16,
        38, 18, 19, 20,
        21, 22, 23, 33,
        25, 26, 10, 28
    ],
    'date':[
        '2020-01-01 19:00', '2020-01-01 19:15', '2020-01-01 19:30', '2020-01-01 19:45', 
        '2020-01-01 20:00', '2020-01-01 20:15', '2020-01-01 20:30', '2020-01-01 20:45', 
        '2020-01-01 21:00', '2020-01-01 21:15', '2020-01-01 21:30', '2020-01-01 21:45', 
        '2020-01-01 22:00', '2020-01-01 22:15', '2020-01-01 22:30', '2020-01-01 22:45', 
        '2020-01-01 23:00', '2020-01-01 23:15', '2020-01-01 23:30', '2020-01-01 23:45', 
        '2020-01-02 00:00', '2020-01-02 00:15', '2020-01-02 00:30', '2020-01-02 00:45', 
        '2020-01-02 01:00', '2020-01-02 01:15', '2020-01-02 01:30', '2020-01-02 01:45', 
    ]
})

# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(price_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
price_df = price_df.set_index(datetime_index)
price_df.drop('date',axis=1,inplace=True)
print(price_df)

How can I create a third DataFrame (new_df) like the one shown below?

new_df
                     long
2020-01-01 19:15:00     2
2020-01-01 20:00:00    15
2020-01-01 22:15:00   NaN
2020-01-01 22:45:00    16
2020-01-02 00:30:00   NaN

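new_df is created by first finding the True values in the long column of signal_df, and then replacing them with the corresponding values (same timestamp) from price_df.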

You can try masking the True values in the long column of signal_df, replacing them with the close_price at the same timestamp:

out = (signal_df['long'].mask(signal_df['long'].eq(True),
                              price_df.loc[signal_df.index, 'close_price'])
       .to_frame())
print(out)

                    long
2020-01-01 19:15:00    2
2020-01-01 20:00:00   15
2020-01-01 22:15:00  NaN
2020-01-01 22:45:00   16
2020-01-02 00:30:00  NaN
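
A variant of the same idea, as a minimal sketch rather than a definitive solution: `price_df.loc[signal_df.index, ...]` raises a `KeyError` if a signal timestamp is missing from `price_df`, whereas `reindex` leaves NaN for such rows. The frames below are a cut-down copy of the question's data (only the price rows that matter plus one extra), and the names `prices` and `is_long` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime([
    '2020-01-01 19:15', '2020-01-01 20:00', '2020-01-01 22:15',
    '2020-01-01 22:45', '2020-01-02 00:30',
])
signal_df = pd.DataFrame({'long': [True, True, np.nan, True, np.nan]}, index=idx)

# Subset of the question's price_df: the five matching rows plus 19:00
price_df = pd.DataFrame(
    {'close_price': [30, 2, 15, 14, 16, 23]},
    index=pd.to_datetime(['2020-01-01 19:00']).append(idx),
)

# Align prices to the signal timestamps; missing timestamps would become NaN
prices = price_df['close_price'].reindex(signal_df.index)

# NaN compares unequal to True, so this is a clean boolean mask
is_long = signal_df['long'].eq(True)

# Keep the price where the signal is True, NaN elsewhere
new_df = prices.where(is_long).to_frame('long')
print(new_df)
```

Because `where` has to insert NaN, the resulting column is float64, so the values print as 2.0, 15.0 and 16.0.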

Alternatively, you can select the matching rows of price_df and mask the ones where long is not True:

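# signal_df.where(...) keeps the original index, so signal_df.index selects the same rows
(price_df.loc[signal_df.index]
         .mask(signal_df['long'].ne(True))   # NaN != True, so non-signal rows become NaN
         .rename(columns={'close_price': 'long'}))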

Output:

                            long
2020-01-01 19:15:00          2.0
2020-01-01 20:00:00         15.0
2020-01-01 22:15:00          NaN
2020-01-01 22:45:00         16.0
2020-01-02 00:30:00          NaN
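
The two outputs above differ only in dtype: the first answer starts from the object-dtype long column, so the prices print as 2 and 15, while the second starts from the numeric close_price column and prints 2.0 and 15.0. If a numeric column is wanted in the first case, `pd.to_numeric` can convert it. The sketch below uses a hand-built object column as a stand-in for the first answer's result; it is an illustration, not part of either answer.

```python
import numpy as np
import pandas as pd

# Stand-in for the first answer's result: Python ints and NaN in an object column
out = pd.DataFrame(
    {'long': pd.Series([2, 15, np.nan, 16, np.nan], dtype=object)}
)
dtype_before = out['long'].dtype

# Convert the mixed int/NaN object column to a proper numeric dtype
out['long'] = pd.to_numeric(out['long'])
dtype_after = out['long'].dtype

print(dtype_before, '->', dtype_after)
```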

