简体   繁体   English

MySQL - 按时间获取最后一个条目(或者如果为空则首先获取)

[英]MySQL - Get Last Entry by Time (or first if Null)

I have the following SQL Server 2008 query: 我有以下SQL Server 2008查询:

SELECT T.*,Data.Value FROM [Table] T OUTER APPLY      

(SELECT TOP 1 E.Value FROM [Table2] E     
ORDER BY CASE WHEN T.TDateTime >= E.EDateTime then 1 else 2 end,
ABS(DateDiff(ss,T.TDateTime,E.EDatetime))) AS Data

This basically gets the Last E value for every record in T , but if the record in T is before the first record in E , then it gets the first record in E . 这基本上得到T每条记录的Last E值,但是如果T中的记录在E的第一条记录之前,那么它将获得E的第一条记录。

What is the equivalent in MySQL? MySQL中的等价物是什么?

EDIT 编辑


Here is my schema and data: 这是我的架构和数据:

DROP TABLE IF EXISTS `data`;
CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;


LOCK TABLES `data` WRITE;
INSERT INTO `data` VALUES (1,'2012-02-01 00:00:00',1),(2,'2012-03-01 01:00:00',2),(3,'2012-04-01 02:00:00',3),(4,'2012-05-01 03:00:00',4),(5,'2012-06-01 04:00:00',5),(6,'2012-07-01 05:00:00',6),(7,'2012-08-01 06:00:00',7),(8,'2012-09-01 07:00:00',8);
UNLOCK TABLES;


DROP TABLE IF EXISTS `t`;
CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

LOCK TABLES `t` WRITE;
INSERT INTO `t` VALUES (1,'2012-01-01 00:00:00'),(2,'2012-02-01 00:00:00'),(3,'2012-02-01 12:00:00'),(4,'2012-03-01 00:00:00'),(5,'2012-04-01 00:00:00'),(6,'2012-05-01 12:00:00'),(7,'2012-06-01 00:00:00'), (8,'2012-07-01 00:00:00');
UNLOCK TABLES;

SQLFiddle: SQLFiddle:

CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `data` (`DataDateTime`, `Value`) VALUES 
('2012-02-01 00:00:00',1),
('2012-03-01 01:00:00',2),
('2012-04-01 02:00:00',3),
('2012-05-01 03:00:00',4),
('2012-06-01 04:00:00',5),
('2012-07-01 05:00:00',6),
('2012-08-01 06:00:00',7),
('2012-09-01 07:00:00',8);


CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `t` (`TDateTime`) VALUES 
('2012-01-01 00:00:00'),
('2012-02-01 00:00:00'),
('2012-02-01 12:00:00'),
('2012-03-01 00:00:00'),
('2012-04-01 00:00:00'),
('2012-05-01 12:00:00'),
('2012-06-01 00:00:00'),
('2012-07-01 00:00:00');

My Desired Output: 我想要的输出:

T.ID, T.TDateTime, Data.DataDateTime, Data.Value
1, 2012-01-01 00:00:00, 2012-02-01 00:00:00, 1
2, 2012-02-01 00:00:00, 2012-02-01 00:00:00, 1
3, 2012-02-01 12:00:00, 2012-02-01 00:00:00, 1
4, 2012-03-01 00:00:00, 2012-02-01 00:00:00, 1
5, 2012-04-01 00:00:00, 2012-03-01 01:00:00, 2
6, 2012-05-01 12:00:00, 2012-05-01 03:00:00, 4
7, 2012-06-01 00:00:00, 2012-05-01 03:00:00, 4
8, 2012-07-01 00:00:00, 2012-06-01 04:00:00, 5

Here's my submission :) 这是我的提交:)

select *, if(Segment1Time<=ifnull(Segment2Time,Segment1Time),
             Segment1Value,
             Segment2Value) Value
from
(
  select *,
    (select DataDateTime from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Time,
    (select Value from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Value,
    (select DataDateTime from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Time,
    (select Value from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Value
  from `t` t
) X
order by tdatetime;

And here's the EXPLAIN for the query. 这是查询的EXPLAIN。 The good thing about this query is that with an index on data.datadatetime , it is linear with the 4 subqueries all producing single SEEKs instead of having to go through all records to rank them. 这个查询的好处是,对于data.datadatetime的索引,它与4个子查询呈线性关系,所有子查询都生成单个SEEK,而不必通过所有记录对它们进行排名。 In theory, it should work better the larger the data table becomes. 理论上, data表变得越大,它应该工作得越好。

ID  SELECT_TYPE TABLE   TYPE    POSSIBLE_KEYS   KEY KEY_LEN REF ROWS    EXTRA
1   PRIMARY <derived2>  ALL (null)  (null)  (null)  (null)  8   Using filesort
2   DERIVED t   ALL (null)  (null)  (null)  (null)  8   
6   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
5   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index
4   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
3   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index

Took me a while to understand the requirements. 花了一些时间来了解这些要求。 In the end started analyzing your base query with the OUTER APPLY and changed it into this : 最后开始用OUTER APPLY分析您的基本查询并将其更改为:

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t_ID = t.ID, 
        data_ID = ISNULL((SELECT TOP 1 ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC),
                         (SELECT TOP 1 ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC))
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Execution plan shows it as less efficient though, which kind of surprised me. 执行计划显示效率较低,但令我感到惊讶的是。 However, adding an index on DataDateTime changed that dramatically something that might come in handy on your MSSQL version ?! 但是,在DataDateTime上添加索引会大大改变您的MSSQL版本可能派上用场的东西吗?

Anyway, starting from here I created this in MySQL : 无论如何,从这里开始我在MySQL中创建了这个:

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t.ID t_ID, 
           COALESCE((SELECT ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC LIMIT 1),
                           (SELECT ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC LIMIT 1)) data_ID
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Seems to do what it's expected to do... 好像做了预期的事......

You could cross join t and data , then rank data rows for every t row based on (an equivalent of) the ORDER BY in your SQL Server query. 您可以交叉连接tdata ,然后根据(相当于)SQL Server查询中的ORDER BY对每个t行排序data行。

The DATEDIFF(ss, dt1, dt2) part could be replaced with UNIX_TIMESTAMP(dt2) - UNIX_TIMESTAMP(dt1) . DATEDIFF(ss, dt1, dt2)部分可以替换为UNIX_TIMESTAMP(dt2) - UNIX_TIMESTAMP(dt1) Ranking could be implemented using variables. 排名可以使用变量来实现。 Here's my attempt at a solution: 这是我尝试解决方案:

SELECT
  ID,
  TDateTime,
  DataDateTime,
  Value
FROM (
  SELECT
    ID,
    TDateTime,
    DataDateTime,
    Value,
    @rnk := @rnk * (@lastid = ID) + 1 AS rnk,
    @lastid := ID
  FROM (
    SELECT
      t.ID,
      t.TDateTime,
      data.DataDateTime,
      data.Value
    FROM
      t CROSS JOIN data,
      (SELECT @lastid := 0, @rnk := 0) s
    ORDER BY
      t.ID,
      (t.TDateTime >= data.DataDateTime) DESC,
      ABS(UNIX_TIMESTAMP(t.TDateTime) - UNIX_TIMESTAMP(data.DataDateTime))
  ) s
) s
WHERE
  rnk = 1
;

You can find a working demo at SQL Fiddle . 你可以在SQL Fiddle找到一个工作演示。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM