
Time series stock data with gaps in a DataFrame, to be modeled in PyCaret

I have a CSV file which I have imported as follows:

import pandas as pd

ps0pyc = pd.read_csv(r'/Users/swapnilgupta/Desktop/fend/p0.csv')
ps0pyc['Date'] = pd.to_datetime(ps0pyc['Date'], dayfirst=True)
ps0pyc

    Date    PORTVAL
0   2013-01-03  17.133585
1   2013-01-04  17.130434
2   2013-01-07  17.396581
3   2013-01-08  17.308323
4   2013-01-09  17.475933
... ... ...
2262    2021-12-28  205.214555
2263    2021-12-29  204.076193
2264    2021-12-30  203.615507
2265    2021-12-31  201.143990
2266    2022-01-03  204.867302
2267 rows × 2 columns

It is a time series DataFrame of stock data, with approximately 252 trading days per year, ranging from 2013 to 2022. I am trying to apply the time series module of PyCaret to it. The only problem I encounter is that PyCaret doesn't support modeling daily data with missing values, and my dataset has only about 252 days of stock data per year rather than a continuous 365/366 days.

What is an alternative solution, and how should I use data with gaps like this in the PyCaret time series module?

Set the Date column as the index of your dataframe

ps0pyc.set_index('Date',inplace=True)

**Create a new continuous index for the period**

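# span only the dates actually present, so no empty days are added before or after the data
new_idx = pd.date_range(ps0pyc.index.min(), ps0pyc.index.max(), freq='D')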

Reindex your dataframe to the newly created index

ps0pyc = ps0pyc.reindex(new_idx , fill_value=0)
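To see what the reindex step does, here is a minimal sketch on a made-up five-row frame. The name `toy` and its values are hypothetical, not taken from the question's file. The jump from Friday 2013-01-04 to Monday 2013-01-07 skips a weekend, so the reindexed frame should gain two rows.

```python
import pandas as pd

# Hypothetical toy data: trading days only, the weekend of 5-6 Jan is absent.
toy = pd.DataFrame(
    {"PORTVAL": [17.13, 17.13, 17.40, 17.31, 17.48]},
    index=pd.to_datetime(
        ["2013-01-03", "2013-01-04", "2013-01-07", "2013-01-08", "2013-01-09"]
    ),
)

# Continuous daily index spanning the observed range.
full_idx = pd.date_range(toy.index.min(), toy.index.max(), freq="D")

# Without fill_value the inserted rows hold NaN; with fill_value=0 they hold 0.
with_nan = toy.reindex(full_idx)
with_zero = toy.reindex(full_idx, fill_value=0)

print(with_nan)
print(with_zero)
```

Comparing the two printed frames shows the trade-off: `fill_value=0` removes the gaps but writes a zero portfolio value on every non-trading day.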

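Filling the gaps with 0 usually distorts a price series. Instead, you can reindex without fill_value, so the missing days become NaN, and then forward fill or back fill:

ps0pyc = ps0pyc.reindex(new_idx)
ps0pyc['PORTVAL'] = ps0pyc['PORTVAL'].ffill()
#or
ps0pyc['PORTVAL'] = ps0pyc['PORTVAL'].bfill()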
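Putting the steps together, a minimal sketch might look like the following. The helper name `fill_calendar_gaps` and the toy values are hypothetical. Forward fill is used on the assumption that a non-trading day should carry the last known portfolio value.

```python
import pandas as pd


def fill_calendar_gaps(df: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Return a copy indexed by every calendar day, with gaps forward-filled."""
    out = df.set_index(date_col).sort_index()
    full_idx = pd.date_range(out.index.min(), out.index.max(), freq="D")
    # reindex inserts NaN rows for the missing days;
    # ffill then carries the last observed value forward into them.
    return out.reindex(full_idx).ffill()


# Hypothetical sample: Friday, then Monday, with the weekend missing.
raw = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2013-01-03", "2013-01-04", "2013-01-07"]),
        "PORTVAL": [17.13, 17.14, 17.40],
    }
)

filled = fill_calendar_gaps(raw)
print(filled)
print(pd.infer_freq(filled.index))  # regular daily spacing after filling
```

The resulting frame has one row per calendar day and no missing values, which is the condition the question describes as required for daily data.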
