Statistical analysis on daily data

I have a number of data points from which I am trying to extract a meaningful pattern (or derive an equation that could then be predictive). I am trying to find a correlation (?) between RANK and DAILY SALES for any given ITEM.

So, for any given item, I have (say) two weeks of daily information; each day consists of a pairing of inventory and rank.

ITEM #1
Monday: 20 in stock (rank 30)
Tuesday: 17 in stock (rank 29)
Wednesday: 14 in stock (rank 31)

The presumption is that 3 items were sold each day, and that selling ~3 a day is roughly what it means to have a rank of ~30.

Given information like this across a wide span of inventory/rank/date pairings (20,000 items, over 2 weeks), I'd like to derive an equation or method for estimating what the daily sales would be for any given rank.

There's one problem:

The data isn't entirely clean because, occasionally, the inventory fluctuates upward, either because of restocking or because of returns. So, for example, you might see something like:

MONDAY: 30 in stock.
TUESDAY: 20 in stock.
WEDNESDAY: 50 in stock.
THURSDAY: 40 in stock.
FRIDAY: 41 in stock.

This indicates that, between Tuesday and Wednesday, 30 more were replenished, and between Thursday and Friday, one was returned.
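To make the cleaning step concrete, here is a minimal sketch of turning a run of daily stock counts into estimated daily sales. The function name `daily_sales` is hypothetical, and it assumes that any day on which stock rises (restock or return) should simply be dropped rather than guessed at, since the true sales for that day cannot be recovered from the counts alone.

```python
from typing import List, Optional


def daily_sales(inventory: List[int]) -> List[Optional[int]]:
    """Estimate units sold between consecutive daily stock counts.

    A rise in stock (restock or return) is reported as None, because
    the sales for that interval cannot be inferred from the counts.
    """
    sales = []
    for previous, current in zip(inventory, inventory[1:]):
        drop = previous - current
        sales.append(drop if drop >= 0 else None)
    return sales


# The two examples from the question.
print(daily_sales([20, 17, 14]))          # [3, 3]
print(daily_sales([30, 20, 50, 40, 41]))  # [10, None, 10, None]
```

Note that a small return on a day with sales (say, 5 sold and 1 returned) would still look like a clean drop of 4, so this only removes the obvious cases.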

I am planning to use the mean and standard deviation of daily sales for a given rank, so that for any given rank I can predict the daily sales from those values. Is this the correct approach? Is there a better approach for this scenario?
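As a rough sketch of the mean/standard deviation idea, the observations could be grouped by rank like this. The function name `sales_stats_by_rank` and the sample numbers are made up for illustration; it assumes the input is already a cleaned list of (rank, units sold) pairs pooled across items, and that a rank seen only once gets no standard deviation.

```python
from collections import defaultdict
from statistics import mean, stdev


def sales_stats_by_rank(observations):
    """Return {rank: (mean_sales, stdev_sales)} from (rank, units_sold) pairs.

    The standard deviation is None for ranks with a single observation,
    because a sample standard deviation needs at least two values.
    """
    by_rank = defaultdict(list)
    for rank, sold in observations:
        by_rank[rank].append(sold)

    stats = {}
    for rank, values in by_rank.items():
        spread = stdev(values) if len(values) > 1 else None
        stats[rank] = (mean(values), spread)
    return stats


# Hypothetical cleaned observations: (rank, units sold that day).
sample = [(30, 3), (30, 4), (30, 2), (29, 3), (31, 3), (31, 5)]
stats = sales_stats_by_rank(sample)
print(stats[30])  # mean 3, sample standard deviation 1.0
```

With only two weeks of data, many individual ranks may have very few observations, so grouping nearby ranks into bins before computing the statistics might give more stable estimates.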

Sounds like this could be a good read for you: fpp

It provides an introduction to time series forecasting. Time series forecasting has a lot of nuance, so it can trip people up pretty easily. You have already noted some of the issues (e.g. seasonality). Others pertain to the statistical properties of such series of data. Take a look through this and
