
Long execution time with BETWEEN in MySQL

Execution time for this query is more than 2 seconds (for 10k rows). Is it possible to optimize this query?

SELECT a.id, MIN(ABS(timestamp_a - timestamp_b))
FROM a 
  INNER JOIN b ON ( timestamp_a  between (timestamp_b - 5 * 60) 
              AND (timestamp_b + 5 * 60) )
GROUP BY a.id

Sample result (id, timestamp_a, timestamp_b, diff):

1   1349878538  1349878539  1
2   1349878679  1349878539  2
3   1349878724  1349878539  1
5   1349878836  1349878539  1
6   1349878890  1349878641  1

Table a

CREATE TABLE `a` (
`id`  int(11) NOT NULL AUTO_INCREMENT ,
`timestamp_a`  bigint(20) NULL DEFAULT NULL ,
PRIMARY KEY (`id`),
INDEX `a` (`timestamp_a`) USING BTREE 
)

Table b

CREATE TABLE `b` (
`id`  int(11) NOT NULL AUTO_INCREMENT ,
`timestamp_b`  bigint(20) NULL DEFAULT NULL ,
PRIMARY KEY (`id`),
INDEX `b` (`timestamp_b`) USING BTREE 
)

The two tables are not otherwise related. I am searching for records in table 'a' whose timestamp lies within five minutes of a timestamp in table 'b'.
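To make the join condition concrete, here is a small sketch that evaluates it for a single pair of values. The two literals are taken from the first row of the sample result above; the column aliases in_window and diff are only illustrative names.

```sql
-- Window around timestamp_b = 1349878539 is [1349878239, 1349878839].
-- BETWEEN is inclusive on both ends and returns 1 (true) or 0 (false) in MySQL.
SELECT
  1349878538 BETWEEN (1349878539 - 5 * 60)
                 AND (1349878539 + 5 * 60) AS in_window,  -- 1
  ABS(1349878538 - 1349878539)             AS diff;       -- 1
```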

EDIT: simplest solution (runs very fast):

SELECT id, MIN(ABS(timestamp_a - timestamp_b))
FROM (SELECT id, timestamp, (timestamp - 5 * 60) timestamp_a, (timestamp + 5 * 60) timestamp_b) a
INNER JOIN b ON ( timestamp between timestamp_a AND timestamp_b )
GROUP BY id

Taking Michael's conventions for the modified timestamp columns, this query will produce the intended results of the original query with the performance of the "faster" query above:

SELECT a.id, MIN(ABS(a.timestamp_a - tmp_b.timestamp_b))
FROM (SELECT id, timestamp_b, (timestamp_b - 5 * 60) timestamp_b_minus, (timestamp_b + 5 * 60) timestamp_b_plus FROM b) tmp_b
INNER JOIN a ON ( a.timestamp_a between tmp_b.timestamp_b_minus AND tmp_b.timestamp_b_plus )
GROUP BY a.id

The original query performs poorly because the RDBMS is forced to perform a full table scan of b for every row in a, due to the formula used in the ON clause.

Even though the "faster" query requires a full table scan of b to generate the temporary table tmp_b, it is able to use the index on a.timestamp_a to extract the appropriate values from a, based on the range between tmp_b.timestamp_b_minus and tmp_b.timestamp_b_plus.
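One way to check this explanation on your own data is to compare the execution plans of the two queries with EXPLAIN. The sketch below assumes the tables a and b defined in the question and assumes that tmp_b is derived from b. The actual plan depends on the MySQL version, the data distribution, and the optimizer settings, so treat it as a diagnostic aid rather than a guaranteed result.

```sql
-- Plan for the original join: the range bounds are computed
-- from b.timestamp_b inside the ON clause.
EXPLAIN
SELECT a.id, MIN(ABS(a.timestamp_a - b.timestamp_b))
FROM a
  INNER JOIN b ON a.timestamp_a BETWEEN (b.timestamp_b - 5 * 60)
                                    AND (b.timestamp_b + 5 * 60)
GROUP BY a.id;

-- Plan for the rewritten join: the bounds are precomputed
-- as columns of the derived table tmp_b.
EXPLAIN
SELECT a.id, MIN(ABS(a.timestamp_a - tmp_b.timestamp_b))
FROM (SELECT id, timestamp_b,
             (timestamp_b - 5 * 60) AS timestamp_b_minus,
             (timestamp_b + 5 * 60) AS timestamp_b_plus
      FROM b) tmp_b
  INNER JOIN a ON a.timestamp_a BETWEEN tmp_b.timestamp_b_minus
                                    AND tmp_b.timestamp_b_plus
GROUP BY a.id;
```

In the output, the type and key columns for each table show whether an index was chosen (for example, type ALL means a full scan), and the Extra column may contain notes such as "Range checked for each record". Newer MySQL versions may merge the derived table into the outer query, in which case both plans can come out the same.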
