简体   繁体   中英

How to replace pandas DataFrame with the values of another DataFrame?

I have 2 DataFrames ( signal_df , and price_df ) that can be generated using the following code.

import pandas as pd
import numpy as np

signal_df = pd.DataFrame({
    'long':[
        True ,True, np.nan, True, np.nan
    ],
    'short':[
        np.nan, np.nan, True, np.nan, True  
    ],
    'date':[
        '2020-01-01 19:15', '2020-01-01 20:00', '2020-01-01 22:15', '2020-01-01 22:45', '2020-01-02 00:30',
    ],
})


# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(signal_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
signal_df = signal_df.set_index(datetime_index)
signal_df.drop('date',axis=1,inplace=True)
print(signal_df)

price_df = pd.DataFrame({
    'close_price':[
        30,  2,  3, 29,
        15,  6, 19, 56,
        9 , 38, 41, 12,
        23, 14, 15, 16,
        38, 18, 19, 20,
        21, 22, 23, 33,
        25, 26, 10, 28
    ],
    'date':[
        '2020-01-01 19:00', '2020-01-01 19:15', '2020-01-01 19:30', '2020-01-01 19:45', 
        '2020-01-01 20:00', '2020-01-01 20:15', '2020-01-01 20:30', '2020-01-01 20:45', 
        '2020-01-01 21:00', '2020-01-01 21:15', '2020-01-01 21:30', '2020-01-01 21:45', 
        '2020-01-01 22:00', '2020-01-01 22:15', '2020-01-01 22:30', '2020-01-01 22:45', 
        '2020-01-01 23:00', '2020-01-01 23:15', '2020-01-01 23:30', '2020-01-01 23:45', 
        '2020-01-02 00:00', '2020-01-02 00:15', '2020-01-02 00:30', '2020-01-02 00:45', 
        '2020-01-02 01:00', '2020-01-02 01:15', '2020-01-02 01:30', '2020-01-02 01:45', 
    ]
})

# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(price_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
price_df = price_df.set_index(datetime_index)
price_df.drop('date',axis=1,inplace=True)
print(price_df)

How can I create a third DataFrame ( new_df ) that looks like the following?

new_df
                     long 
2020-01-01 19:15:00  2
2020-01-01 20:00:00  15
2020-01-01 22:15:00   NaN
2020-01-01 22:45:00  16
2020-01-02 00:30:00   NaN

new_df is created by first looking for the True values in signal_df 's long column, then replacing them with the corresponding values in price_df .

You can try mask the True value in long column of signal_df

out = (signal_df['long'].mask(signal_df['long'].eq(True),
                              price_df.loc[signal_df.index, 'close_price'])
       .to_frame())
print(out)

                    long
2020-01-01 19:15:00    2
2020-01-01 20:00:00   15
2020-01-01 22:15:00  NaN
2020-01-01 22:45:00   16
2020-01-02 00:30:00  NaN

You can do:

(price_df.loc[signal_df.where(signal_df.long).index].mask(
         signal_df.long.ne(True)).rename(columns={'close_price':'long'}))

output:

                            long
2020-01-01 19:15:00          2.0
2020-01-01 20:00:00         15.0
2020-01-01 22:15:00          NaN
2020-01-01 22:45:00         16.0
2020-01-02 00:30:00          NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM