
Demand forecasting using multivariate time series forecasting

Originally posted 2020-07-24 07:59:23 · Tags: python, time-series, arima

I have multivariate time series data with the fields Order_date, store_id, region, product_ID, Unit_sold, discount, holiday (yes/no), etc. There are 50 unique products. I need to forecast demand for each product. I want to apply a SARIMAX model to this dataset.

Do I need to build an individual forecast model for each product separately, or is there a workaround for forecasting multiple products together?

Another aspect: how should I check the stationarity of a multivariate time series? I came across the ADF test, which works for univariate data, and Johansen's test, which can handle up to 12 independent variables. Is Johansen's test the best way of checking the stationarity of a multivariate time series?

I am a beginner in time series. Please guide me through the steps.

1 Answer

Let's approach this with an example. Suppose you sell sweaters, IKEA furniture, and ice cream. Logically, sweaters will sell best just before and during winter; IKEA furniture sells best on weekends but is fairly even throughout the year; and ice cream sells best in summer, mostly when it's hot. If you fit one time series model to all of these at once, then even though the products might all show trends with the same periodicities, the effects will be completely opposite!

Of course more people buy ice cream, sweaters, and furniture during weekends, but the impact of it being a weekend will be much larger for furniture than for the others. And sweaters and ice cream probably both show yearly trends, but in opposite directions.

I'd advise you to build a model for one product first, then look into automating the process, and for the rest of the products just review the results of the automation.

Although some of us have mathematical backgrounds, asking which (statistical) test is best is bound to get subjective, complex answers, since that really depends on the situation. Supposing that you're working for a business, in my experience it's often sufficient to get a good-enough answer instead of a perfect one. Yang and Shahabi use Johansen's test, for example, and talk about stationarizing non-stationary multivariate time series that fail it.

In the end, the main way you'll find out whether an approach worked is through trial and error. If the series passes Johansen's test but you see in the results that the predictions get worse over time, then the time series apparently wasn't stationary after all. If you want a more mathematically correct answer, or if you're not working in a business environment, I'd pose the second question at CrossValidated, which has similar queries.
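One simple way to see whether predictions "get worse over time" is to compare the error at each step ahead across several forecast runs. The helper below (`mae_by_horizon`, a hypothetical name) is a minimal sketch of that check; the numbers are invented purely to show the shape of the calculation.

```python
import numpy as np


def mae_by_horizon(actual, predicted):
    """Mean absolute error per forecast step.

    Rows are separate forecast runs, columns are steps ahead (1, 2, 3, ...).
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted).mean(axis=0)


# Two forecast runs, three steps ahead each (made-up numbers).
actual = np.array([[10, 12, 14],
                   [11, 13, 15]])
predicted = np.array([[10, 11, 11],
                      [11, 12, 12]])

errors = mae_by_horizon(actual, predicted)
print(errors)  # error grows with the horizon: 0, 1, 3
```

Some growth in error with the horizon is normal for any forecast; it is a steep or accelerating increase that would hint at a problem such as residual non-stationarity.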