Time interval calculation in time series using SQL

I have a MySQL table like this:

CREATE TABLE IF NOT EXISTS `vals` (
  `DT` datetime NOT NULL,
  `value` INT(11) NOT NULL,
  PRIMARY KEY (`DT`)
);

DT is a unique date and time.

Sample data:

INSERT INTO `vals` (`DT`,`value`) VALUES
('2011-02-05 06:05:00', 300),
('2011-02-05 11:05:00', 250),
('2011-02-05 14:35:00', 145),
('2011-02-05 16:45:00', 100),
('2011-02-05 18:50:00', 125),
('2011-02-05 19:25:00', 100),
('2011-02-05 21:10:00', 125),
('2011-02-06 00:30:00', 150);

I need to get something like this:

start|end|value
NULL,'2011-02-05 06:05:00',300
'2011-02-05 06:05:00','2011-02-05 11:05:00',250
'2011-02-05 11:05:00','2011-02-05 14:35:00',145
'2011-02-05 14:35:00','2011-02-05 16:45:00',100
'2011-02-05 16:45:00','2011-02-05 18:50:00',125
'2011-02-05 18:50:00','2011-02-05 19:25:00',100
'2011-02-05 19:25:00','2011-02-05 21:10:00',125
'2011-02-05 21:10:00','2011-02-06 00:30:00',150
'2011-02-06 00:30:00',NULL,NULL

I tried the following query:

SELECT T1.DT AS `start`,T2.DT AS `stop`, T2.value AS value FROM (
  SELECT DT FROM vals
) T1
LEFT JOIN (
  SELECT DT,value FROM  vals
) T2
ON T2.DT > T1.DT ORDER BY T1.DT ASC

but it returns too many rows (29 instead of 9), and I could not find any way to limit this using SQL. Is this possible in MySQL?

Use a subquery:

SELECT
  (
     select max(T1.DT)
     from vals T1
     where T1.DT < T2.DT
  ) AS `start`,
  T2.DT AS `stop`,
  T2.value AS value
FROM vals T2
ORDER BY T2.DT ASC
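
Note that this returns one row per row in vals (8 for the sample data), so the closing row of the desired output ('2011-02-06 00:30:00', NULL, NULL) is missing. A minimal sketch of one way to add it, assuming a UNION ALL with an extra row built from MAX(DT) is acceptable; the parentheses and the final ORDER BY are additions for illustration, not part of the query above:

```sql
-- Sketch: the correlated subquery from above, plus one extra row that
-- closes the series (last DT as start, NULL stop and NULL value).
(
  SELECT
    (SELECT MAX(T1.DT) FROM vals T1 WHERE T1.DT < T2.DT) AS `start`,
    T2.DT    AS `stop`,
    T2.value AS value
  FROM vals T2
)
UNION ALL
(
  -- Closing row: starts at the latest timestamp and has no end yet.
  SELECT MAX(DT), NULL, NULL
  FROM vals
)
-- MySQL sorts NULL first in ascending order, so the row with a NULL
-- start comes first and the closing row (largest start) comes last.
ORDER BY `start` ASC;
```

With the sample data this should give the 9 rows asked for. If vals is empty, the second SELECT still produces a single all-NULL row, which you may want to filter out.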

You can also use a MySQL-specific solution employing variables:

SELECT CAST( @dt AS DATETIME ) AS `start` , @dt := DT AS `stop` , `value` 
FROM (SELECT @dt := NULL) dt, vals
ORDER BY dt ASC

But you need to do it carefully:

  • the ORDER BY must be present, otherwise the variable does not roll forward properly from row to row
  • the variable needs to be set to NULL within the query itself, using a subquery; otherwise, if you run the query twice in a row, the second run will not start with NULL
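
If you are on MySQL 8.0 or later (an assumption; the question does not state a version), the LAG() window function is an alternative to the variable approach that avoids both pitfalls. MySQL's documentation notes that the order of evaluation for expressions involving user variables is undefined, and MySQL 8.0 deprecates assigning user variables inside a SELECT. A sketch against the same vals table:

```sql
-- Sketch, assuming MySQL 8.0+ (window functions are not available in 5.x).
-- LAG(DT) returns the DT of the previous row in DT order,
-- or NULL for the first row, with no session state to reset.
SELECT
  LAG(DT) OVER (ORDER BY DT) AS `start`,
  DT                         AS `stop`,
  `value`
FROM vals
ORDER BY DT ASC;
```

Like the subquery version, this returns one row per row in vals, with start being NULL on the first row.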

You can use a server-side variable to simulate it:

select @myvar as `start`, DT as `end`, value, @myvar := DT as next_rows_start
from vals
order by DT

Variables are evaluated from left to right in sequence, so the two references to @myvar (start and next_rows_start) will output two different values.

Just remember to reset @myvar to NULL before and/or after the query, otherwise the second and subsequent runs will have a wrong first row:

select @myvar := null

This would be easier if the table had a running ID column that corresponds to the times in DT (same order). If you don't want to change the table, you can use a temporary table:

drop table if exists temp;

CREATE TABLE temp (
  `id` INT(11) AUTO_INCREMENT,
  `DT` datetime NOT NULL,
  `value` INT(11) NOT NULL,
  PRIMARY KEY (`id`)
);

insert into temp (DT,value) select * from vals order by DT asc;

select t1.DT as `start`, t2.DT as `end`, t2.value 
from  temp t2
left join temp t1 ON t2.id = t1.id + 1;
