Pandas moving average calculation missing dates?

I am trying to compute a simple moving average (SMA) from the live forex data from Alpha Vantage (an API key can be registered for free from Alpha Vantage). Everything seems to work fine, except that the period covered by the SMA is incomplete.

For example, if I set data[:'2020-1-1'], it should return data from 1 January 2020 up to the current date. However, the period from December 2020 to January 2021 is missing.

I tried plotting the graph and realized that the larger my moving average period is, the more of the most recent data is removed. The plot is as follows:

GBP/USD

Dataframe for beginning period 2020-1-1

Dataframe period fixed but NaN values

Below is my code, in 3 separate files:

This is the execution.py file:

from alpha_vantage.foreignexchange import ForeignExchange
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from to_USD import currencyExchangeB
from smvgavg import sma

UniSymbol = 'USD'
fromSymbolsB = ['EUR','GBP']

for symbol in fromSymbolsB:
    # store the result that return from "currencyExchange()" function
    result = currencyExchangeB(UniSymbol,symbol)

    # generate graph by passing those result of each currency data
    sma(result,UniSymbol,symbol)

The next one is the to_USD.py file, which I use to pull the foreign currency data from Alpha Vantage:

from alpha_vantage.foreignexchange import ForeignExchange
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import datetime

# added a class to collect a collection of variable so it can be return easily in the following function
class GRAPH_STRUCT:
    date:any
    symbol:str
    def __init__(self,date,symbol):
        self.date = date
        self.symbol = symbol

def currencyExchangeB(toCurrencyB,fromCurrencyB):
    # create an array to store the result
    result = []

    cc = ForeignExchange(key='%ALPHA_VANTAGE_APIKEY%',output_format='pandas')
    data, meta_data = cc.get_currency_exchange_daily(from_symbol=fromCurrencyB,to_symbol=toCurrencyB,outputsize='full')

    # append those result in the array
    result = GRAPH_STRUCT(data[:'2020-1-1'],toCurrencyB)

    # return those result in the end of the function
    return result

The last file is smvgavg.py, which calculates the simple moving average:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

#dataObj is a datatype of GRAPH_STRUCT
def sma(dataObj,UniSymbol,symbol):
    maA = 25
    maB = 50
    maT = [maA,maB]
    for ma in maT:
        smaString = "SMA" + str(ma)

        data = dataObj.date
        data[smaString] = data.iloc[:,3].rolling(window = ma).mean()
        data = data.iloc[ma:]

        print(data)
        fig = plt.gcf()
        fig.set_size_inches(10, 6)
        plt.plot(data['4. close'], label='Close' if ma == 0 else "",color='red')
        plt.plot(data['SMA' + str(ma)],label='SMA' + str(ma))

    plt.title(symbol + '/' + UniSymbol, y=1)
    plt.xlabel("Date")
    plt.ylabel("Exchange Rate")
    plt.legend()
    plt.show()

Any help would be appreciated. Thanks.

This isn't because of pandas or Python; rather, it's because of how the SMA is calculated. Unless you do the computation yourself to see what's going on, it can be tricky to spot.

The simple moving average takes the n most recent closing prices, sampled at even intervals, adds them up, and divides the sum by n. Where P is the closing price and the number is the position of an evenly spaced interval within your n-day SMA, one period of a 5-day SMA would look something like this:

(P1 + P2 + P3 + P4 + P5)/5
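To make the formula concrete, here is a minimal sketch that computes the first 5-day window by hand and compares it with pandas' `rolling(...).mean()`. The prices and dates are made up for illustration and the variable names are hypothetical; nothing here comes from the Alpha Vantage data in the question.

```python
import pandas as pd

# Made-up closing prices, oldest first (illustrative values only).
closes = pd.Series(
    [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
    index=pd.date_range("2021-01-04", periods=7, freq="B"),
)

n = 5

# First complete window by hand: (P1 + P2 + P3 + P4 + P5) / 5
manual_first = closes.iloc[:n].sum() / n

# pandas attaches each window's mean to the last row of that window,
# so the first n - 1 rows have no value (NaN).
sma = closes.rolling(window=n).mean()

print(manual_first)
print(sma)
```

The first four rows of `sma` are empty because no full 5-row window ends on them; the fifth row holds the same value as the hand computation.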

The reason some of your starting dates appear to be missing is that the data point calculated from a period is placed at the end of that period. For example, let's have a look at Airbnb (ABNB) on TD Ameritrade's thinkorswim. This stock was only recently listed, so we can easily see that its 50-day SMA (purple line) also has some dates that appear to be missing:

[image]

What you can do is fetch earlier data that you are not going to display in your graph.

For example, if you want to display a 10-day SMA line for December 2020 to February 2021, you could include the data from the 10 days before your start date in the calculation, and then graph only from your start date to your end date. This fills in the area that appears to be missing, because your calculations now account for the previous period (which won't be graphed).
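A small sketch of where the empty values end up, using made-up prices and hypothetical variable names. One thing worth checking in your own data: `rolling()` works in row order, not date order, so if a frame happens to be sorted newest-first (an assumption about your data, not something I have verified), the empty rows fall on the most recent dates rather than the earliest ones.

```python
import pandas as pd

# Made-up prices on six business days (illustrative only).
dates = pd.date_range("2021-01-04", periods=6, freq="B")
oldest_first = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=dates)
newest_first = oldest_first.iloc[::-1]  # same data, reversed row order

window = 3
sma_oldest_first = oldest_first.rolling(window).mean()
sma_newest_first = newest_first.rolling(window).mean()

# Dates that end up without an SMA value in each case.
gap_oldest_first = list(sma_oldest_first[sma_oldest_first.isna()].index)
gap_newest_first = list(sma_newest_first[sma_newest_first.isna()].index)

print(gap_oldest_first)  # the two earliest dates
print(gap_newest_first)  # the two latest dates

# Sorting by date before rolling puts the gap back at the start.
sma_sorted = newest_first.sort_index().rolling(window).mean()
```

If your frame turns out to be newest-first, calling `sort_index()` before `rolling()` is one way to keep the gap at the beginning of the range.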
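As a rough sketch of that idea: compute the SMA over a longer history first, then cut the result down to the range you want to plot. The prices below are synthetic, and `closes`, `start`, and `end` are hypothetical names; in your case the series would be the close column from the full Alpha Vantage history.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes that start well before the display range.
idx = pd.date_range("2020-10-01", "2021-02-26", freq="B")
closes = pd.Series(np.linspace(1.30, 1.40, len(idx)), index=idx, name="close")

window = 10
start, end = "2020-12-01", "2021-02-26"

# 1) Roll over the full history, oldest first, so the warm-up rows
#    fall before the display range.
sma_full = closes.sort_index().rolling(window).mean()

# 2) Only then slice to the dates that will be graphed.
sma_display = sma_full.loc[start:end]
closes_display = closes.sort_index().loc[start:end]

# For comparison: slicing first and rolling afterwards leaves
# window - 1 empty rows at the start of the displayed range.
sma_sliced_first = closes.sort_index().loc[start:end].rolling(window).mean()

print(sma_display.isna().sum(), sma_sliced_first.isna().sum())
```

The order of operations is the whole point here: roll first, slice second. Doing it the other way round reintroduces the gap.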
