
Creating hour groups for time series data in MySQL

I have a MySQL database with data recorded every 15 minutes. For simplicity, let's assume there are 2 fields:

DATETIME Created
Double Value

I would like to draw a chart that needs the opening, minimum, maximum, and closing values for each hour. To do this I need to return results from my MySQL query to my PHP code, which builds the JSON. I would like to do this in the MySQL query so that the response is cached.

Here is an example of the problem: given 9 data points, I am trying to get 2 hour groups:

Creation            Value
2014-03-25 12:15:00 413.17011
2014-03-25 12:00:00 414
2014-03-25 11:45:00 415
2014-03-25 11:30:00 415
2014-03-25 11:15:00 415.5
2014-03-25 11:00:00 415.5
2014-03-25 10:45:00 416
2014-03-25 10:30:00 416
2014-03-25 10:15:00 415.99

I would need:

Hour 1 (11:15:00 to 12:15:00)
Open: 415.5
Close: 413.17011
High: 415.5
Low: 413.17011

Hour 2 (10:15:00 to 11:15:00)
Open: 415.99
Close: 415.5
High: 416
Low: 415.5

Of course, this would need repeating for the full 24 hours; this is just an example. Any help is really appreciated!

Here is the current MySQL dump for the example (using MySQL version 2.6.4-pl3):

-- 
-- Table structure for table `exampleTable`
-- 

CREATE TABLE `exampleTable` (
  `created` datetime NOT NULL,
  `value` double NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

-- 
-- Dumping data for table `exampleTable`
-- 

INSERT INTO `exampleTable` VALUES ('2014-03-25 12:15:00', 413.17011);
INSERT INTO `exampleTable` VALUES ('2014-03-25 12:00:00', 414);
INSERT INTO `exampleTable` VALUES ('2014-03-25 11:45:00', 415);
INSERT INTO `exampleTable` VALUES ('2014-03-25 11:30:00', 415);
INSERT INTO `exampleTable` VALUES ('2014-03-25 11:15:00', 415.5);
INSERT INTO `exampleTable` VALUES ('2014-03-25 11:00:00', 415.5);
INSERT INTO `exampleTable` VALUES ('2014-03-25 10:45:00', 416);
INSERT INTO `exampleTable` VALUES ('2014-03-25 10:30:00', 416);
INSERT INTO `exampleTable` VALUES ('2014-03-25 10:15:00', 415.99);

Get it to work

You might try

SELECT
  DATE(a.created) AS day,
  HOUR(a.created) AS hour,
  (
    -- Open: value of the earliest row in the same date and hour
    SELECT b.value FROM `exampleTable` AS b
    WHERE DATE(a.created) = DATE(b.created)
      AND HOUR(a.created) = HOUR(b.created)
    ORDER BY b.created ASC LIMIT 1
  ) AS Open,
  (
    -- Close: value of the latest row in the same date and hour
    SELECT b.value FROM `exampleTable` AS b
    WHERE DATE(a.created) = DATE(b.created)
      AND HOUR(a.created) = HOUR(b.created)
    ORDER BY b.created DESC LIMIT 1
  ) AS Close,
  MIN(a.value) AS Low,
  MAX(a.value) AS High
FROM `exampleTable` AS a
GROUP BY DATE(a.created), HOUR(a.created)

This groups all your rows by DATE + HOUR and computes MIN and MAX as Low and High, respectively. To find the first and last row for Open and Close, the easiest approach in SQL is a subselect. It selects all rows that are relevant for the current group, sorts them ascending or descending, and then takes the first row.
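To see what that grouping yields on the nine sample rows from the question, here is a minimal Python sketch of the same logic. It is not MySQL code, and the function name `ohlc_by_hour` is made up for illustration; it only mirrors the idea of grouping by date and hour, then taking the first, last, smallest, and largest value per group.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question: (created, value)
rows = [
    ("2014-03-25 12:15:00", 413.17011),
    ("2014-03-25 12:00:00", 414.0),
    ("2014-03-25 11:45:00", 415.0),
    ("2014-03-25 11:30:00", 415.0),
    ("2014-03-25 11:15:00", 415.5),
    ("2014-03-25 11:00:00", 415.5),
    ("2014-03-25 10:45:00", 416.0),
    ("2014-03-25 10:30:00", 416.0),
    ("2014-03-25 10:15:00", 415.99),
]

def ohlc_by_hour(rows):
    """Group rows by (date, hour) and return open/close/low/high per group."""
    # Sort chronologically so the first and last row of each group are
    # the opening and closing values.
    parsed = sorted(
        (datetime.strptime(created, "%Y-%m-%d %H:%M:%S"), value)
        for created, value in rows
    )
    result = {}
    for key, group in groupby(
        parsed, key=lambda r: (r[0].date().isoformat(), r[0].hour)
    ):
        values = [value for _, value in group]
        result[key] = {
            "open": values[0],    # like ORDER BY created ASC LIMIT 1
            "close": values[-1],  # like ORDER BY created DESC LIMIT 1
            "low": min(values),
            "high": max(values),
        }
    return result

for key, bar in ohlc_by_hour(rows).items():
    print(key, bar)
```

On the sample data this produces three groups (hours 10, 11, and 12), because the rows are split on clock-hour boundaries.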

Please note that this groups only by clock hour. Instead of

Hour 1 (11:15:00 to 12:15:00)
Hour 2 (10:15:00 to 11:15:00)

it groups like

Hour 1 (11:00:00 to 11:59:00)
Hour 2 (10:00:00 to 10:59:00)

If you want to keep the 15-minute offset, you can subtract it from your created timestamp (created - INTERVAL 15 MINUTE) at every occurrence of created in the SQL query above.
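The effect of that subtraction on the bucket boundaries can be checked with a small Python sketch. This is an illustration only, assuming the same timestamp format as the sample data; `shifted_hour` is a hypothetical helper, not part of the query.

```python
from datetime import datetime, timedelta

OFFSET = timedelta(minutes=15)  # mirrors "created - INTERVAL 15 MINUTE"

def shifted_hour(created):
    """Return the (date, hour) bucket after shifting the timestamp back by OFFSET."""
    shifted = datetime.strptime(created, "%Y-%m-%d %H:%M:%S") - OFFSET
    return shifted.date().isoformat(), shifted.hour

for created in (
    "2014-03-25 11:00:00",
    "2014-03-25 11:15:00",
    "2014-03-25 12:00:00",
    "2014-03-25 12:15:00",
):
    print(created, "->", shifted_hour(created))
```

With the shift, 11:15:00 through 12:00:00 share bucket 11, while 12:15:00 starts bucket 12. Each bucket therefore runs from :15:00 up to, but not including, the next :15:00, so a row exactly on the boundary belongs to only one group.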

I created a working sqlfiddle for you.

Performance

Just as a hint: if you can, you might want to split date and time into two columns (of types date and time). This way you do not need to call DATE() on created every time, but can use the new date column instead. You can then add a combined index to these new columns too, which speeds up your query. See this sqlfiddle for an example.

To get your grouping right, you can use

 FLOOR(( UNIX_TIMESTAMP(myTable.dateCreated) - 900 ) / 3600)

where 3600 sets the interval to 1 hour and the - 900 sets the offset to 00:15.
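The arithmetic behind that expression can be sketched in Python. This is an illustration under one stated assumption: the timestamps are treated as UTC. MySQL's UNIX_TIMESTAMP() interprets a DATETIME in the session time zone, so the buckets only line up on :15 when that zone's offset is a whole number of hours. The helper names `bucket` and `bucket_start` are hypothetical.

```python
from datetime import datetime, timezone

INTERVAL = 3600  # bucket width in seconds (1 hour)
OFFSET = 900     # bucket start offset in seconds (15 minutes)

def bucket(created):
    """Mirror FLOOR((UNIX_TIMESTAMP(created) - 900) / 3600)."""
    # Assumption: the timestamp is interpreted as UTC.
    ts = datetime.strptime(created, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return (int(ts.timestamp()) - OFFSET) // INTERVAL

def bucket_start(b):
    """Recover the first instant that falls into bucket number b."""
    return datetime.fromtimestamp(b * INTERVAL + OFFSET, tz=timezone.utc)

for created in ("2014-03-25 11:15:00", "2014-03-25 12:00:00", "2014-03-25 12:15:00"):
    b = bucket(created)
    print(created, "-> bucket", b, "starting", bucket_start(b).strftime("%H:%M:%S"))
```

Changing INTERVAL to 1800, for example, would give half-hour groups with the same 15-minute offset, which is the main advantage of this form over HOUR().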

Since each of your four values needs a MIN() or MAX(), you'll need to JOIN the main table to itself, grouped by the min or max (depending on the column).

Finally, each sub-query (joined table) calculates the grouping hour above so you can use it to join them. Here's what I came up with (with slightly different column names and

SELECT openDate, Open, Close, High, Low
FROM (
    -- Open: value of the earliest row in each offset hour
    SELECT FLOOR((UNIX_TIMESTAMP(myTable.dateCreated) - 900) / 3600) AS theHour,
           myTable.value AS Open,
           myTable.dateCreated AS openDate
    FROM myTable
    JOIN (
        SELECT MIN(dateCreated) AS dateCreated
        FROM myTable
        GROUP BY FLOOR((UNIX_TIMESTAMP(dateCreated) - 900) / 3600)
    ) AS aggTable ON aggTable.dateCreated = myTable.dateCreated
) AS openTable
LEFT JOIN (
    -- Close: value of the latest row in each offset hour
    SELECT FLOOR((UNIX_TIMESTAMP(myTable.dateCreated) - 900) / 3600) AS theHour,
           myTable.value AS Close,
           myTable.dateCreated AS closeDate
    FROM myTable
    JOIN (
        SELECT MAX(dateCreated) AS dateCreated
        FROM myTable
        GROUP BY FLOOR((UNIX_TIMESTAMP(dateCreated) - 900) / 3600)
    ) AS aggTable ON aggTable.dateCreated = myTable.dateCreated
) AS closeTable ON openTable.theHour = closeTable.theHour
LEFT JOIN (
    -- High: largest value in each offset hour
    SELECT FLOOR((UNIX_TIMESTAMP(myTable.dateCreated) - 900) / 3600) AS theHour,
           MAX(value) AS High
    FROM myTable
    GROUP BY theHour
) AS highTable ON closeTable.theHour = highTable.theHour
LEFT JOIN (
    -- Low: smallest value in each offset hour
    SELECT FLOOR((UNIX_TIMESTAMP(myTable.dateCreated) - 900) / 3600) AS theHour,
           MIN(value) AS Low
    FROM myTable
    GROUP BY theHour
) AS lowTable ON highTable.theHour = lowTable.theHour
