Prediction on time series analysis using ARIMA in R

I am new to programming and am attempting to create a prediction model for multiple articles. Unfortunately, using Excel or similar software is not possible for this task, so I have installed RStudio to solve this problem. My goal is to make an 18-month prediction for each article in my dataset using an ARIMA model.

However, I am currently facing an issue with the format of my data frame. Specifically, I am unsure of how my CSV should be structured to be read by my code.

I have attached an image of my current dataset in CSV format: https://i.stack.imgur.com/AQJx1.png

Here is my dput(sales_data): structure(list(X.Article.1.Article.2.Article.3 = c("janv-19;42;49;55", "f\xe9vr-19;56;58;38", "mars-19;55;59;76")), class = "data.frame", row.names = c(NA, -3L))

I have also included the code I have put together so far with the help of blogs and websites:

library(forecast)
library(reshape2)

sales_data <- read.csv("sales_data.csv", header = TRUE)

sales_data_long <- reshape2::melt(sales_data, id.vars = "Code Article")

for(i in 1:nrow(sales_data_long)) {
  
  sales_data_article <- subset(sales_data_long, sales_data_long$`Code Article` == sales_data_long[i,"Code Article"])
  
  sales_ts <- ts(sales_data_article$value, start = c(2010,6), frequency = 12)
  
  arima_fit <- auto.arima(sales_ts)

  arima_forecast <- forecast(arima_fit, h = 18)
  
  print(arima_forecast)
  print("Article: ", Code article[i])
}

With this code, RStudio gives me the following error: "Error: id variables not found in data: Code Article"

Currently, I am not interested in generating any plots or outputs. My main focus is on identifying the appropriate format for my data.

Do I need to modify my CSV file and separate each column using "," or ";"? Or, can I keep my data in its current format and make adjustments in the code instead?

The dput output was added at jrcalabrese's request. I swapped reshape2 for its replacement, tidyr, and used pivot_longer, which does not raise the error that reshape2::melt did. The structure of the CSV doesn't matter much, and yours was fine. Hope this helps :-)

library(tidyr)
sales_data <- structure(list(var1 = c("Article 1", "Article 2", "Article 3"),
`janv-19` = c(42, 56, 55),
`fev-19` = c(49, 58, 59),
`mars-19` = c(55, 38, 76)),
row.names = c(NA, 3L), class = "data.frame")

sales_data_long <- sales_data |> pivot_longer(!var1,
                                              names_to = "month",
                                              values_to = "count")
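
The dput output in the question shows a single column named X.Article.1.Article.2.Article.3 holding strings such as "janv-19;42;49;55". That suggests the file is semicolon-separated but was read with the default comma separator, which would also explain why no "Code Article" column exists. Below is a minimal sketch of reading such a file with sep = ";" and then fitting one forecast per article column. Several things here are assumptions, not taken from the original post: the 36 months of values are simulated placeholders (three rows are too few for a meaningful ARIMA fit), the temporary file stands in for sales_data.csv, the first column is assumed to hold the month label, and the series is assumed to start in January 2019.

```r
library(forecast)

# Placeholder data: 36 simulated months for three articles. These numbers are
# made up purely so the example runs; replace this step with your own file.
set.seed(1)
months <- format(seq(as.Date("2019-01-01"), by = "month", length.out = 36), "%Y-%m")
demo <- data.frame(month = months,
                   `Article 1` = rpois(36, 50),
                   `Article 2` = rpois(36, 55),
                   `Article 3` = rpois(36, 60),
                   check.names = FALSE)
csv_path <- tempfile(fileext = ".csv")  # stands in for "sales_data.csv"
write.table(demo, csv_path, sep = ";", row.names = FALSE, quote = FALSE)

# Read a semicolon-separated file: sep = ";" is the key change.
# check.names = FALSE keeps "Article 1" instead of converting it to "Article.1".
sales_data <- read.csv(csv_path, header = TRUE, sep = ";", check.names = FALSE)

# Every column except the first (the month label) is treated as one article.
article_cols <- names(sales_data)[-1]

# Fit one model per article and forecast 18 months ahead.
forecasts <- lapply(article_cols, function(article) {
  # Assumed start date: January 2019, monthly data.
  sales_ts <- ts(sales_data[[article]], start = c(2019, 1), frequency = 12)
  forecast(auto.arima(sales_ts), h = 18)
})
names(forecasts) <- article_cols

print(forecasts[["Article 1"]])
```

Looping over the article columns, instead of over every row of the long-format data as in the question, fits each article exactly once. read.csv2 is an alternative to read.csv(..., sep = ";"): it defaults to a semicolon separator and a comma as the decimal mark.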
