
How to change index frequency in a time series

I am using the yfinance library to import data for a given stock. See the code below:

import yfinance as yf
from datetime import datetime as dt
import pandas as pd

# Naming Constants
stock = "AAPL"
start_date = "2014-01-01"
end_date = "2018-01-01"

# Importing all the data into a dataFrame
stock_data = yf.download(stock, start=start_date, end=end_date)

When I call print(stock_data.index) I get the following:

DatetimeIndex(['2014-01-02', '2014-01-03', '2014-01-06', '2014-01-07', '2014-01-08', '2014-01-09', '2014-01-10', '2014-01-13', '2014-01-14', '2014-01-15',
               ...
               '2017-12-15', '2017-12-18', '2017-12-19', '2017-12-20', '2017-12-21', '2017-12-22', '2017-12-26', '2017-12-27', '2017-12-28', '2017-12-29'],
              dtype='datetime64[ns]', name='Date', length=1007, freq=None)

I wish to switch the freq attribute from None to daily, since every Date refers to a trading day.

When I set stock_data.index.freq = 'B' I get the following error:

ValueError: Inferred frequency None from passed values does not conform to passed frequency B

And if I use stock_data = stock_data.asfreq('B'), it changes the frequency, but it also adds rows that were not there originally and fills them with NA values.

In other words, what is the offset alias for trading days?

You can find the list of offset aliases in the pandas documentation here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases

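The error raised by stock_data.index.freq = 'B' indicates that the dates in your index do not form a complete business-day sequence, so pandas infers the frequency as None. The alias 'B' covers every weekday from Monday to Friday, while your index leaves out weekdays on which the market was closed, such as public holidays.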
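To see which dates cause the mismatch, it can help to compare the index against a full business-day range. The sketch below is only an illustration: it uses a small hand-built index in place of the downloaded stock_data.index, with dates chosen (as an assumption) to span 2014-01-20, a Monday on which US markets were closed for Martin Luther King Jr. Day.

```python
import pandas as pd

# Hand-built stand-in for stock_data.index (an assumption for illustration):
# trading days around Monday 2014-01-20, when US markets were closed.
trading_days = pd.DatetimeIndex(
    ["2014-01-16", "2014-01-17", "2014-01-21", "2014-01-22"], name="Date"
)

# Every weekday between the first and last date, holidays included.
business_days = pd.date_range(trading_days[0], trading_days[-1], freq="B")

# Weekdays that 'B' expects but the trading data does not contain.
missing = business_days.difference(trading_days)
print(missing)  # the only gap is 2014-01-20

# Assigning 'B' is rejected because the index does not match that range.
try:
    trading_days.freq = "B"
except ValueError as err:
    print("ValueError:", err)
```

Running the same difference against the real index should list every weekday that asfreq('B') would add as a new row.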

With

stock_data = stock_data.asfreq('B')

you are re-indexing your time series to business-day frequency: the missing timestamps are added, and the stock data values for them are set to NaN. Now you need to decide how to replace them, so have a look at the documentation for pandas.DataFrame.asfreq. You could replace all NaNs with a fixed value such as -999, but with stock data you generally want to carry the last valid value forward, which is forward filling the gaps:

stock_data = stock_data.asfreq('B', method='ffill')
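As a minimal sketch of the difference between the two calls, the example below applies asfreq to a tiny hand-built frame. The column name Close and all prices are made up for illustration; they are not taken from the yfinance download.

```python
import pandas as pd

# Hypothetical closing prices; the values are invented for illustration.
prices = pd.DataFrame(
    {"Close": [10.0, 11.0, 12.0, 13.0]},
    index=pd.DatetimeIndex(
        ["2014-01-16", "2014-01-17", "2014-01-21", "2014-01-22"], name="Date"
    ),
)

# Without a fill method, the row added for 2014-01-20 holds NaN.
with_gaps = prices.asfreq("B")

# With method="ffill", the added row repeats the previous close (11.0).
filled = prices.asfreq("B", method="ffill")

print(with_gaps)
print(filled)
print(filled.index.freq)  # the index now carries a business-day frequency
```

Note that the fill method only applies to rows introduced by the re-indexing; NaN values that were already present in the original data are left unchanged.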

It's always worth reading the docs.
