Prediction on time series analysis using ARIMA in R

Question

I am new to programming and am attempting to create a prediction model for multiple articles. Unfortunately, using Excel or similar software is not possible for this task, so I have installed RStudio to solve this problem. My goal is to make an 18-month prediction for each article in my dataset using an ARIMA model.

However, I am currently facing an issue with the format of my data frame. Specifically, I am unsure of how my CSV should be structured to be read by my code.

I have attached an image of my current dataset in CSV format: https://i.stack.imgur.com/AQJx1.png

Here is my dput(sales_data): structure(list(X.Article.1.Article.2.Article.3 = c("janv-19;42;49;55", "f\xe9vr-19;56;58;38", "mars-19;55;59;76")), class = "data.frame", row.names = c(NA, -3L))

Here is the code I have put together so far with the help of blogs and websites:

library(forecast)
library(reshape2)

sales_data <- read.csv("sales_data.csv", header = TRUE)

sales_data_long <- reshape2::melt(sales_data, id.vars = "Code Article")

for(i in 1:nrow(sales_data_long)) {
  
  sales_data_article <- subset(sales_data_long, sales_data_long$`Code Article` == sales_data_long[i,"Code Article"])
  
  sales_ts <- ts(sales_data_article$value, start = c(2010,6), frequency = 12)
  
  arima_fit <- auto.arima(sales_ts)

  arima_forecast <- forecast(arima_fit, h = 18)
  
  print(arima_forecast)
  print("Article: ", Code article[i])
}

With this code, RStudio gives me the following error: "Error: id variables not found in data: Code Article"

Currently, I am not interested in generating any plots or outputs. My main focus is on identifying the appropriate format for my data.

Do I need to modify my CSV file and separate each column using "," or ";"? Or, can I keep my data in its current format and make adjustments in the code instead?

Answer 1

I added the dput output as per jrcalabrese's request. I swapped reshape2 for its replacement, tidyr, and used pivot_longer. This no longer gives the error that was raised by reshape2::melt. The structure of the CSV doesn't matter that much; your structure was fine. Hope this helps :-)

library(tidyr)
sales_data <- structure(list(var1 = c("Article 1", "Article 2", "Article 3"),
                             `janv-19` = c(42, 56, 55),
                             `fev-19` = c(49, 58, 59),
                             `mars-19` = c(55, 38, 76)),
                        row.names = c(NA, 3L), class = "data.frame")

sales_data_long <- sales_data |> pivot_longer(!var1,
                                              names_to = "month",
                                              values_to = "count")
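
As a complement, here is a minimal sketch of the whole pipeline on a semicolon-separated file like the one in the question. The dput output there shows a single column holding strings such as "janv-19;42;49;55", which suggests the file was read with the default comma separator; passing sep = ";" to read.csv splits it into proper columns. This is not the code from the answer above: the data is simulated placeholder data (the three rows in the question are too few for an ARIMA fit), the column name month, the start date c(2019, 1), and the variable names are assumptions, and it keeps the data wide (one column per article) rather than pivoting it.

```r
library(forecast)

# Simulated placeholder sales (NOT the asker's data): 36 months x 3 articles.
set.seed(42)
n_months <- 36
demo <- data.frame(
  month = format(seq(as.Date("2019-01-01"), by = "month",
                     length.out = n_months), "%Y-%m"),
  `Article 1` = round(50 + rnorm(n_months, sd = 5)),
  `Article 2` = round(55 + rnorm(n_months, sd = 5)),
  `Article 3` = round(60 + rnorm(n_months, sd = 5)),
  check.names = FALSE
)

# Write a semicolon-separated file shaped like the one in the question.
csv_path <- tempfile(fileext = ".csv")
write.table(demo, csv_path, sep = ";", row.names = FALSE, quote = FALSE)

# sep = ";" is the key change: without it each row lands in a single column.
# check.names = FALSE keeps headers such as "Article 1" unchanged.
sales_data <- read.csv(csv_path, sep = ";", header = TRUE, check.names = FALSE)

# Every column except the month label is treated as one article.
article_cols <- setdiff(names(sales_data), "month")

# Fit one ARIMA model per article and forecast 18 months ahead.
# start = c(2019, 1) assumes the series begins in January 2019.
forecasts <- lapply(article_cols, function(article) {
  sales_ts <- ts(sales_data[[article]], start = c(2019, 1), frequency = 12)
  forecast(auto.arima(sales_ts), h = 18)
})
names(forecasts) <- article_cols

for (article in article_cols) {
  cat("Article:", article, "\n")
  print(forecasts[[article]])
}
```

With a real file, only the read.csv call and the start argument should need adjusting; read.csv2 is an alternative that defaults to sep = ";" and a decimal comma.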
