
MySQL - Joining two tables with datetime columns on Date and three most recent entries before that date

I have two tables in my MySQL database.

For example, Table1 - ItemPrice:

DATETIME   | ITEM | PRICE
2011-08-28 | ABC  | 123
2011-09-01 | ABC  | 125
2011-09-02 | ABC  | 124
2011-09-03 | ABC  | 127
2011-09-04 | ABC  | 126

Table2 - DayScore:

DATETIME   | ITEM | SCORE
2011-08-28 | ABC  | 1
2011-08-29 | ABC  | 8
2011-09-01 | ABC  | 4
2011-09-02 | ABC  | 2
2011-09-03 | ABC  | 7
2011-09-04 | ABC  | 3

I want to write a query which, given an item ID (e.g. ABC), returns the price at that date from ItemPrice (if there is no price for that date, the query should not return anything). If a valid price is found for the query date, the query should return (in 9 columns):

  • the price of the item from ItemPrice for the past three days (i.e. the most recent 3 prices before the date queried).
  • In the next three columns, from DayScore, the matching score for those 3 dates selected from ItemPrice.
  • Finally, the dates (t-1 to t-3) selected.

In other words, taking just date='2011-09-03' and item='ABC' as an example, the query would return:

DATE      |  ITEM  |  PRICE  |  SCR  | PRC_t-1 | PRC_t-2 | PRC_t-3 | SCR_t-1 | SCR_t-2 | SCR_t-3 | DATE_t-1  | DATE_t-2  | DATE_t-3 
2011-09-03|  ABC   |  127    |  7    | 124     | 125     | 123     | 2       | 4       | 1       | 2011-09-02| 2011-09-01| 2011-08-28
....

And so on for each date that appears in the ItemPrice table.

What is the neatest and most efficient way to run this query (as it's something that will be run over many millions of rows)?

Cheers!

It's not pretty, but it does produce the results. You could probably get rid of some subselects and make it a bit less SQL, but I tried to build it up in steps so you can deduce what it is doing.

The core part is this select:

SELECT 
  Sub2.*
, (Select MAX(IP3.DateTime) FROM ItemPrice IP3 where IP3.DateTime < T_2) AS T_3
FROM
   (SELECT 
        Sub1.*
      , (Select MAX(IP2.DateTime) FROM ItemPrice IP2 where IP2.DateTime < T_1) AS T_2
    FROM
       (SELECT 
            ItemPrice.DateTime
          , (Select MAX(IP.DateTime) FROM ItemPrice IP where IP.DateTime < ItemPrice.DateTime) AS T_1 
        From ItemPrice) Sub1
   ) Sub2

This returns a table with the dates (now, t-1, t-2, t-3). From there it is a simple matter of joining in the price and score for each of those dates. The whole thing, including test data, then becomes this bulk of SQL:

/*
CREATE TABLE ItemPrice (datetime Date, item varchar(3), price int);
CREATE TABLE DayScore ( datetime Date, item varchar(3), score int);

INSERT INTO ItemPrice VALUES ('20110828', 'ABC', 123);
INSERT INTO ItemPrice VALUES ('20110901', 'ABC', 125);
INSERT INTO ItemPrice VALUES ('20110902', 'ABC', 124);
INSERT INTO ItemPrice VALUES ('20110903', 'ABC', 127);
INSERT INTO ItemPrice VALUES ('20110904', 'ABC', 126);

INSERT INTO DayScore VALUES ('20110828', 'ABC', 1);
INSERT INTO DayScore VALUES ('20110829', 'ABC', 8);
INSERT INTO DayScore VALUES ('20110901', 'ABC', 4);
INSERT INTO DayScore VALUES ('20110902', 'ABC', 2);
INSERT INTO DayScore VALUES ('20110903', 'ABC', 7);
INSERT INTO DayScore VALUES ('20110904', 'ABC', 3);
*/

-- Alias names are written with one consistent case throughout, because MySQL
-- treats table aliases as case-sensitive on Unix by default. The alias Cur is
-- used instead of CURRENT, which is a reserved word in some SQL dialects.
SELECT Hist.*, Cur.Item, Cur.Price, Cur.Score
, Minus1.Price AS PRC_1, Minus1.Score AS SCR_1
, Minus2.Price AS PRC_2, Minus2.Score AS SCR_2
, Minus3.Price AS PRC_3, Minus3.Score AS SCR_3
FROM 
    -- Hist: every price date with the three preceding price dates (T_1, T_2, T_3).
    -- Note: these subqueries compare dates only, so they assume ItemPrice holds a single item.
    (SELECT Sub2.*, (SELECT MAX(IP3.DateTime) FROM ItemPrice IP3 WHERE IP3.DateTime < T_2) AS T_3
    FROM
        (SELECT Sub1.*, (SELECT MAX(IP2.DateTime) FROM ItemPrice IP2 WHERE IP2.DateTime < T_1) AS T_2
        FROM
            (SELECT ItemPrice.DateTime, (SELECT MAX(IP.DateTime) FROM ItemPrice IP WHERE IP.DateTime < ItemPrice.DateTime) AS T_1 FROM ItemPrice) Sub1) Sub2) Hist 
INNER JOIN
    -- Cur: price and score on the date itself.
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Cur
ON (Cur.DateTime = Hist.DateTime)
-- Minus1..Minus3: price and score on T_1, T_2 and T_3 (NULL when there is no earlier date).
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus1
ON (Minus1.DateTime = Hist.T_1)
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus2
ON (Minus2.DateTime = Hist.T_2)
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus3
ON (Minus3.DateTime = Hist.T_3)
WHERE Cur.Item = 'ABC'

;

/*
DROP TABLE ItemPrice;
DROP TABLE DayScore;
*/
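
One thing to watch: the date-chain subqueries compare dates only, so if ItemPrice holds more than one item, each MAX() can pick up another item's date. Below is a minimal sketch of the same chain with an item filter added. It is run through Python against an in-memory SQLite database purely as a stand-in for MySQL (an assumption made so it can be tried without a server), and the second item XYZ is hypothetical test data, not part of the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for easy experimentation).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ItemPrice (datetime TEXT, item TEXT, price INTEGER);
INSERT INTO ItemPrice VALUES
  ('2011-08-28', 'ABC', 123), ('2011-09-01', 'ABC', 125),
  ('2011-09-02', 'ABC', 124), ('2011-09-03', 'ABC', 127),
  ('2011-09-04', 'ABC', 126),
  -- Hypothetical second item, added only to show why the item filter matters.
  ('2011-08-30', 'XYZ', 50),  ('2011-09-02', 'XYZ', 51);
""")

# Same three-level chain as the core select, but every correlated subquery
# also matches on item, so T_1..T_3 never come from a different item.
DATE_CHAIN = """
SELECT Sub2.item, Sub2.datetime, Sub2.T_1, Sub2.T_2,
       (SELECT MAX(IP3.datetime) FROM ItemPrice IP3
         WHERE IP3.item = Sub2.item AND IP3.datetime < Sub2.T_2) AS T_3
FROM (SELECT Sub1.item, Sub1.datetime, Sub1.T_1,
             (SELECT MAX(IP2.datetime) FROM ItemPrice IP2
               WHERE IP2.item = Sub1.item AND IP2.datetime < Sub1.T_1) AS T_2
      FROM (SELECT P.item, P.datetime,
                   (SELECT MAX(IP.datetime) FROM ItemPrice IP
                     WHERE IP.item = P.item AND IP.datetime < P.datetime) AS T_1
            FROM ItemPrice P) Sub1) Sub2
ORDER BY Sub2.item, Sub2.datetime
"""

rows = conn.execute(DATE_CHAIN).fetchall()
for row in rows:
    print(row)
```

With the filter in place, the row for ABC on 2011-09-01 gets 2011-08-28 as T_1 rather than XYZ's 2011-08-30. The joins in the full query would then need to match on item as well as on date.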

I'm curious about your explain plan when you do this on 1M rows :) It might not even be that horrible if you have the right indexes, which you probably do.
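
On the efficiency question, an alternative worth trying (not what the query above does) is the LAG() window function, assuming MySQL 8.0 or later where window functions are available. The sketch below runs the idea through Python against in-memory SQLite as a stand-in for MySQL. The index names are hypothetical, and the composite (item, datetime) indexes are an assumption about what "the right indexes" might look like. Unlike the query above, it LEFT JOINs DayScore, so a date with a price but no score still produces a row.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); LAG() needs SQLite 3.25+ or MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ItemPrice (datetime TEXT, item TEXT, price INTEGER);
CREATE TABLE DayScore  (datetime TEXT, item TEXT, score INTEGER);
-- Hypothetical composite indexes: rows for one item can be read in date order.
CREATE INDEX idx_itemprice_item_dt ON ItemPrice (item, datetime);
CREATE INDEX idx_dayscore_item_dt  ON DayScore  (item, datetime);
INSERT INTO ItemPrice VALUES
  ('2011-08-28', 'ABC', 123), ('2011-09-01', 'ABC', 125),
  ('2011-09-02', 'ABC', 124), ('2011-09-03', 'ABC', 127),
  ('2011-09-04', 'ABC', 126);
INSERT INTO DayScore VALUES
  ('2011-08-28', 'ABC', 1), ('2011-08-29', 'ABC', 8), ('2011-09-01', 'ABC', 4),
  ('2011-09-02', 'ABC', 2), ('2011-09-03', 'ABC', 7), ('2011-09-04', 'ABC', 3);
""")

# LAG(x, n) reads x from the row n positions earlier in the window, i.e. the
# n-th most recent price date before the current one for the same item.
LAG_QUERY = """
SELECT p.datetime, p.item, p.price, s.score,
       LAG(p.price, 1)    OVER w AS prc_1,
       LAG(p.price, 2)    OVER w AS prc_2,
       LAG(p.price, 3)    OVER w AS prc_3,
       LAG(s.score, 1)    OVER w AS scr_1,
       LAG(s.score, 2)    OVER w AS scr_2,
       LAG(s.score, 3)    OVER w AS scr_3,
       LAG(p.datetime, 1) OVER w AS date_1,
       LAG(p.datetime, 2) OVER w AS date_2,
       LAG(p.datetime, 3) OVER w AS date_3
FROM ItemPrice p
LEFT JOIN DayScore s ON s.item = p.item AND s.datetime = p.datetime
WHERE p.item = ?
WINDOW w AS (PARTITION BY p.item ORDER BY p.datetime)
ORDER BY p.datetime
"""

rows = conn.execute(LAG_QUERY, ("ABC",)).fetchall()
for row in rows:
    print(row)
```

This makes a single ordered pass per item instead of three nested correlated subqueries, and it assumes at most one DayScore row per item and date. On MySQL, prefixing either query with EXPLAIN shows whether the indexes are actually used.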
