Time Prediction based on existing date:time records

I have a system that logs date:time and it returns results such as:

05.28.2013 11:58pm
05.27.2013 10:20pm
05.26.2013 09:47pm
05.25.2013 07:30pm
05.24.2013 06:24pm
05.23.2013 05:36pm

What I would like to be able to do is have a list of date:time predictions for the next few days, so a person could see when the next event might occur.

Example of prediction results:

06.01.2013 04:06pm
05.31.2013 03:29pm
05.30.2013 01:14pm

Thoughts on how to go about doing time prediction of this kind with PHP?

The basic answer is "no". Programming tools are not designed to do prediction; statistical tools are designed for that purpose. You should be thinking more about R, SPSS, SAS, or some other similar tool. Some databases have rudimentary data analysis tools built in, which is another (often inferior) option.

The standard statistical technique for time-series prediction is called ARIMA analysis (auto-regressive integrated moving average). It is unlikely that you are going to be implementing that in PHP/SQL. The standard statistical technique for estimating time between events is Poisson regression. It is also highly unlikely that you are going to be implementing that in PHP/SQL.

I observe that your data points occur once per day, in the evening. I might guess that this is the end of some process that runs during the day. The end time is based on the start time and the duration of the process.

What can you do? Often a reasonable prediction is "what happened yesterday". You would be surprised at how hard it is to beat this prediction for weather forecasting and for estimating the stock market. Another very reasonable method is the average of historical values.
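To make the two baselines above concrete, here is a minimal sketch using the six records from the question. It is written in Python for brevity (the arithmetic ports directly to PHP's DateTime); the function names are made up for illustration, and the mean is taken over minutes since midnight, which assumes the events never straddle midnight.

```python
from datetime import datetime, timedelta

# Log format taken from the question: MM.DD.YYYY hh:mm followed by am/pm
LOG_FORMAT = "%m.%d.%Y %I:%M%p"

log = [
    "05.28.2013 11:58pm",
    "05.27.2013 10:20pm",
    "05.26.2013 09:47pm",
    "05.25.2013 07:30pm",
    "05.24.2013 06:24pm",
    "05.23.2013 05:36pm",
]

# Parse the log and order it oldest first
events = sorted(datetime.strptime(line, LOG_FORMAT) for line in log)


def minutes_since_midnight(moment):
    """Time of day expressed as minutes after 00:00."""
    return moment.hour * 60 + moment.minute


def predict_same_as_yesterday(events):
    """Baseline 1: the next event happens at the same time of day as the last one."""
    return events[-1] + timedelta(days=1)


def predict_mean_time_of_day(events):
    """Baseline 2: the day after the last event, at the historical mean time of day.

    Assumption: all events fall in the same part of the day, so a plain
    arithmetic mean of clock times is meaningful (it is not across midnight).
    """
    mean_minutes = sum(minutes_since_midnight(e) for e in events) / len(events)
    next_day = events[-1].replace(hour=0, minute=0, second=0, microsecond=0)
    next_day += timedelta(days=1)
    return next_day + timedelta(minutes=round(mean_minutes))


print("same as yesterday:", predict_same_as_yesterday(events))
print("historical mean:  ", predict_mean_time_of_day(events))
```

Neither baseline follows the drift in this particular log, where each event lands roughly an hour later than the previous one, so treat them as reference points rather than forecasts.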

If you know something about your process, then an average by day of the week can work well. You can also get more sophisticated, and do Monte Carlo estimates, by measuring the average and standard deviation, and then pulling a random value from a statistical distribution. However, the average value would work just as well in your case.
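A rough sketch of both ideas, again in Python with the question's six records. The helper names are hypothetical, and the normal distribution is my assumption; the answer only says "a statistical distribution". With six consecutive days each weekday appears at most once, so the per-weekday average only becomes useful with more history.

```python
import random
import statistics
from collections import defaultdict
from datetime import datetime

LOG_FORMAT = "%m.%d.%Y %I:%M%p"

log = [
    "05.28.2013 11:58pm",
    "05.27.2013 10:20pm",
    "05.26.2013 09:47pm",
    "05.25.2013 07:30pm",
    "05.24.2013 06:24pm",
    "05.23.2013 05:36pm",
]
events = sorted(datetime.strptime(line, LOG_FORMAT) for line in log)


def minutes_since_midnight(moment):
    """Time of day expressed as minutes after 00:00."""
    return moment.hour * 60 + moment.minute


def mean_minutes_by_weekday(events):
    """Average time of day per weekday (0 = Monday ... 6 = Sunday)."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event.weekday()].append(minutes_since_midnight(event))
    return {day: statistics.mean(values) for day, values in buckets.items()}


def monte_carlo_minutes(events, draws=1000, seed=0):
    """Simulate times of day from a normal distribution fitted to the history.

    Assumption: a normal distribution is a reasonable fit for the data.
    Draws are clamped so that they stay inside a single day.
    """
    minutes = [minutes_since_midnight(e) for e in events]
    mu = statistics.mean(minutes)
    sigma = statistics.stdev(minutes)
    rng = random.Random(seed)
    return [min(max(rng.gauss(mu, sigma), 0), 24 * 60 - 1) for _ in range(draws)]


by_day = mean_minutes_by_weekday(events)
samples = monte_carlo_minutes(events)
print("weekdays with history:", sorted(by_day))
print("simulated mean (minutes after midnight):", round(statistics.mean(samples)))
```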

I would suggest that you study a bit about statistics/data mining/predictive analytics before attempting to do any "predictions". At the very least, if you really have a problem in this domain, you should be looking for the right tools to use.

As Gordon Linoff posted, the simple answer is "no", but you can write some code that will give a rough guess at what the next time will be.

I wrote a very basic example of how to do this on my site: http://livinglion.com/2013/05/next-occurrence-in-datetime-sequence/
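One way such a rough guess might look (this is not the code from the linked post, just a sketch of one simple approach): take the average gap between consecutive events and add it repeatedly to the most recent event. The function name `predict_next` is made up for illustration, and the sketch is in Python; the same calculation can be done in PHP with DateTime and timestamps.

```python
from datetime import datetime

LOG_FORMAT = "%m.%d.%Y %I:%M%p"

log = [
    "05.28.2013 11:58pm",
    "05.27.2013 10:20pm",
    "05.26.2013 09:47pm",
    "05.25.2013 07:30pm",
    "05.24.2013 06:24pm",
    "05.23.2013 05:36pm",
]
events = sorted(datetime.strptime(line, LOG_FORMAT) for line in log)


def predict_next(events, count=3):
    """Extrapolate `count` future events using the mean gap between past events."""
    events = sorted(events)
    if len(events) < 2:
        raise ValueError("need at least two events to measure a gap")
    # The mean of consecutive gaps equals the total span divided by the gap count
    mean_gap = (events[-1] - events[0]) / (len(events) - 1)
    return [events[-1] + mean_gap * step for step in range(1, count + 1)]


for prediction in predict_next(events):
    print(prediction)
```

Because the gap here averages a little over 25 hours, this approach follows the way the events creep later each day, which a plain time-of-day average cannot do.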

Here's a possible way that this could be done, using PHP + MySQL:

  • You can have a table with two fields: a DATE field and a TIME field (essentially storing the date and time portions separately). Say that the table is named "timeData" and the fields are:

  • eventDate: date

  • eventTime: time

Your primary key would be the combination of eventDate and eventTime, so that they're never repeated as a pair.

Then, you can do a query like:

SELECT eventTime, count(*) as counter FROM timeData GROUP BY eventTime ORDER BY counter DESC LIMIT 0, 10

The aforementioned query will always return the 10 most frequent event times, ordered by frequency. You can then sort these again from earliest to latest.

This way, you can return quite accurate time prediction results, which will become even more accurate as you gather data each day.
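To try the idea without a MySQL server, here is a small sketch that runs essentially the same query against an in-memory SQLite database from Python. The rows are invented for illustration, with times rounded to the hour (my assumption, not part of the answer), and I added a secondary sort on eventTime so that ties come back in a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same layout as described above: the date/time pair is the primary key
conn.execute(
    """
    CREATE TABLE timeData (
        eventDate TEXT NOT NULL,
        eventTime TEXT NOT NULL,
        PRIMARY KEY (eventDate, eventTime)
    )
    """
)

# Hypothetical sample rows, NOT the question's data
rows = [
    ("2013-05-23", "17:00"),
    ("2013-05-24", "18:00"),
    ("2013-05-25", "18:00"),
    ("2013-05-26", "21:00"),
    ("2013-05-27", "18:00"),
    ("2013-05-28", "21:00"),
]
conn.executemany("INSERT INTO timeData VALUES (?, ?)", rows)


def most_frequent_times(conn, limit=10):
    """Return (eventTime, counter) pairs, most frequent first."""
    cursor = conn.execute(
        "SELECT eventTime, COUNT(*) AS counter FROM timeData "
        "GROUP BY eventTime ORDER BY counter DESC, eventTime ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


top = most_frequent_times(conn)
by_time = sorted(top)  # re-sort the winners from earliest to latest
print(top)
print(by_time)
```

Note that if times are stored to the exact minute, as in the question's log, every eventTime is likely to be unique and every counter will be 1, so some rounding or bucketing is probably needed before the frequencies mean anything.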
