简体   繁体   中英

MySQL - Get Last Entry by Time (or first if Null)

I have the following SQL Server 2008 query:

SELECT T.*,Data.Value FROM [Table] T OUTER APPLY      

(SELECT TOP 1 E.Value FROM [Table2] E     
ORDER BY CASE WHEN T.TDateTime >= E.EDateTime then 1 else 2 end,
ABS(DateDiff(ss,T.TDateTime,E.EDatetime))) AS Data

This basically gets the Last E value for every record in T , but if the record in T is before the first record in E , then it gets the first record in E .

What is the equivalent in MySQL?

EDIT


Here is my schema and data:

DROP TABLE IF EXISTS `data`;
CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;


LOCK TABLES `data` WRITE;
INSERT INTO `data` VALUES (1,'2012-02-01 00:00:00',1),(2,'2012-03-01 01:00:00',2),(3,'2012-04-01 02:00:00',3),(4,'2012-05-01 03:00:00',4),(5,'2012-06-01 04:00:00',5),(6,'2012-07-01 05:00:00',6),(7,'2012-08-01 06:00:00',7),(8,'2012-09-01 07:00:00',8);
UNLOCK TABLES;


DROP TABLE IF EXISTS `t`;
CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

LOCK TABLES `t` WRITE;
INSERT INTO `t` VALUES (1,'2012-01-01 00:00:00'),(2,'2012-02-01 00:00:00'),(3,'2012-02-01 12:00:00'),(4,'2012-03-01 00:00:00'),(5,'2012-04-01 00:00:00'),(6,'2012-05-01 12:00:00'),(7,'2012-06-01 00:00:00'), (8,'2012-07-01 00:00:00');
UNLOCK TABLES;

SQLFiddle:

CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `data` (`DataDateTime`, `Value`) VALUES 
('2012-02-01 00:00:00',1),
('2012-03-01 01:00:00',2),
('2012-04-01 02:00:00',3),
('2012-05-01 03:00:00',4),
('2012-06-01 04:00:00',5),
('2012-07-01 05:00:00',6),
('2012-08-01 06:00:00',7),
('2012-09-01 07:00:00',8);


CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `t` (`TDateTime`) VALUES 
('2012-01-01 00:00:00'),
('2012-02-01 00:00:00'),
('2012-02-01 12:00:00'),
('2012-03-01 00:00:00'),
('2012-04-01 00:00:00'),
('2012-05-01 12:00:00'),
('2012-06-01 00:00:00'),
('2012-07-01 00:00:00');

My Desired Output:

T.ID, T.TDateTime, Data.DataDateTime, Data.Value
1, 2012-01-01 00:00:00, 2012-02-01 00:00:00, 1
2, 2012-02-01 00:00:00, 2012-02-01 00:00:00, 1
3, 2012-02-01 12:00:00, 2012-02-01 00:00:00, 1
4, 2012-03-01 00:00:00, 2012-02-01 00:00:00, 1
5, 2012-04-01 00:00:00, 2012-03-01 01:00:00, 2
6, 2012-05-01 12:00:00, 2012-05-01 03:00:00, 4
7, 2012-06-01 00:00:00, 2012-05-01 03:00:00, 4
8, 2012-07-01 00:00:00, 2012-06-01 04:00:00, 5

Here's my submission :)

select *, if(Segment1Time<=ifnull(Segment2Time,Segment1Time),
             Segment1Value,
             Segment2Value) Value
from
(
  select *,
    (select DataDateTime from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Time,
    (select Value from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Value,
    (select DataDateTime from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Time,
    (select Value from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Value
  from `t` t
) X
order by tdatetime;

And here's the EXPLAIN for the query. The good thing about this query is that with an index on data.datadatetime , it is linear with the 4 subqueries all producing single SEEKs instead of having to go through all records to rank them. In theory, it should work better the larger the data table becomes.

ID  SELECT_TYPE TABLE   TYPE    POSSIBLE_KEYS   KEY KEY_LEN REF ROWS    EXTRA
1   PRIMARY <derived2>  ALL (null)  (null)  (null)  (null)  8   Using filesort
2   DERIVED t   ALL (null)  (null)  (null)  (null)  8   
6   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
5   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index
4   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
3   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index

Took me a while to understand the requirements. In the end started analyzing your base query with the OUTER APPLY and changed it into this :

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t_ID = t.ID, 
        data_ID = ISNULL((SELECT TOP 1 ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC),
                         (SELECT TOP 1 ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC))
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Execution plan shows it as less efficient though, which kind of surprised me. However, adding an index on DataDateTime changed that dramatically something that might come in handy on your MSSQL version ?!

Anyway, starting from here I created this in MySQL :

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t.ID t_ID, 
           COALESCE((SELECT ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC LIMIT 1),
                           (SELECT ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC LIMIT 1)) data_ID
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Seems to do what it's expected to do...

You could cross join t and data , then rank data rows for every t row based on (an equivalent of) the ORDER BY in your SQL Server query.

The DATEDIFF(ss, dt1, dt2) part could be replaced with UNIX_TIMESTAMP(dt2) - UNIX_TIMESTAMP(dt1) . Ranking could be implemented using variables. Here's my attempt at a solution:

SELECT
  ID,
  TDateTime,
  DataDateTime,
  Value
FROM (
  SELECT
    ID,
    TDateTime,
    DataDateTime,
    Value,
    @rnk := @rnk * (@lastid = ID) + 1 AS rnk,
    @lastid := ID
  FROM (
    SELECT
      t.ID,
      t.TDateTime,
      data.DataDateTime,
      data.Value
    FROM
      t CROSS JOIN data,
      (SELECT @lastid := 0, @rnk := 0) s
    ORDER BY
      t.ID,
      (t.TDateTime >= data.DataDateTime) DESC,
      ABS(UNIX_TIMESTAMP(t.TDateTime) - UNIX_TIMESTAMP(data.DataDateTime))
  ) s
) s
WHERE
  rnk = 1
;

You can find a working demo at SQL Fiddle .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM