How to loop ARIMA through several columns of a DataFrame

I will start by saying that I am in no way a Python expert, but my current project has to be programmed in Python, so any help and guidance is appreciated.

I have a time series with daily data and 2000+ items.

I wish to run ARIMA on each of these 2000+ columns. They do not depend on each other, so it is basically like running 2000+ independent ARIMA analyses.

I have written a piece of code that loops through the columns and trains a model on each one with the parameters (order) provided to it. But it looks like, as it moves on through the columns, it forgets what the model learnt before. Is there a way to change the code so that the trained results are stored and can be used to predict on the test set? I am trying to train ARIMA on all the columns (Col1 to Col8) and predict the last couple of values of Col9.

Sample dataset:

date                   Col1      Col2       Col3      Col4       Col5       Col6        Col7      Col8      Col9
2022-01-02 10:30:00     24         24        24.8      24.8       25         25         25.5      26.3      26.9   
2022-01-02 10:45:00     59         58         60       60.3       59.3       59.2       58.4      56.9      58.0   
2022-01-02 11:00:00     43.7       43.9       48        48        48.1       48.9       49        49.5      49.5   
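
from statsmodels.tsa.arima.model import ARIMA

# Train/test split
train = df.iloc[:, :]      # every row of every column
test = df.iloc[90:, -1]    # rows 90 onward of the last column

order = (1, 2, 1)  # <- plug in p, d, q here
for col in train.columns:
    model = ARIMA(train[col], order=order)
    model.initialize_approximate_diffuse()
    model = model.fit()    # `model` is rebound on every iteration
model.summary()

predictions = model.predict(len(test))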

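I think the problem is that the loop reassigns `model` on every iteration, so once it finishes only the fit for the last column is left. Instead of overwriting the fitted model inside the loop, store each fitted result, for example in a list.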
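To make the overwriting visible without fitting anything, here is a minimal sketch in plain Python. The strings are hypothetical stand-ins for fitted ARIMA results and the column names are only illustrative.

```python
columns = ["Col1", "Col2", "Col3"]  # illustrative column names

# Pattern from the question: one variable, rebound on every pass.
last = None
for col in columns:
    last = f"model fitted on {col}"  # stand-in for ARIMA(...).fit()

print(last)  # only the result for the final column is still reachable

# Pattern suggested here: keep one result per column.
kept = {}
for col in columns:
    kept[col] = f"model fitted on {col}"

print(list(kept))  # every column still has its own result
```

Nothing is forgotten during training here; the earlier results are simply no longer referenced by any variable.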

Let me know if the code below works on your side:

from statsmodels.tsa.arima.model import ARIMA

# Train/test split
train = df.iloc[:, :]
test = df.iloc[90:, -1]

order = (1, 2, 1)  # <- plug in p, d, q here

models = []  # one fitted result per column

for col in train.columns:
    model = ARIMA(train[col], order=order)
    model.initialize_approximate_diffuse()

    # Instead of `model = model.fit()`, keep every fitted result:
    models.append(model.fit())

print(models[0].summary())  # show the summary of the first model

# Create a list of predictions, one entry per column.
# Note: the first positional argument of predict() is `start`, so this
# returns predictions from position len(test) to the end of the sample.
predictions = []

for fitted in models:
    predictions.append(fitted.predict(len(test)))
