
MySQL optimization on join tables with range criteria

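I am joining two tables by matching a single position in one table against a range (represented by two columns) in the other table.

However, the query is too slow: it takes about 20 minutes. I have tried adding indexes to the tables and changing the query, but the performance is still poor.

So I am asking how to speed up the join.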


The following is the MySQL query.

mysql> SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
    -> FROM `inVar`
    -> LEFT JOIN `openChrom_K562`
    -> ON (
    -> `inVar`.chrom=`openChrom_K562`.chrom AND
    -> `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
    -> );

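inVar and openChrom_K562 are the tables I use.

inVar stores a single position in each row.

openChrom_K562 stores range information, given by chromStart and chromEnd.

inVar contains 57902 rows and openChrom_K562 contains 137373 rows.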


Fields in the tables.

mysql> DESCRIBE inVar;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| chrom | varchar(31) | NO   | PRI | NULL    |       |
| pos   | int(10)     | NO   | PRI | NULL    |       |
+-------+-------------+------+-----+---------+-------+

mysql> DESCRIBE openChrom_K562;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| chrom      | varchar(31) | NO   | MUL | NULL    |       |
| chromStart | int(10)     | NO   | MUL | NULL    |       |
| chromEnd   | int(10)     | NO   |     | NULL    |       |
| score      | int(10)     | NO   |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+

Indexes built on the tables.

mysql> SHOW INDEX FROM inVar;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| inVar |          0 | PRIMARY  |            1 | chrom       | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| inVar |          0 | PRIMARY  |            2 | pos         | A         |       57902 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

mysql> SHOW INDEX FROM openChrom_K562;
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| openChrom_K562 |          1 | start_end   |            1 | chromStart  | A         |      137373 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | start_end   |            2 | chromEnd    | A         |      137373 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | chrom_only  |            1 | chrom       | A         |          22 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | chrom_start |            1 | chrom       | A         |          22 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | chrom_start |            2 | chromStart  | A         |      137373 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | chrom_end   |            1 | chrom       | A         |          22 |     NULL | NULL   |      | BTREE      |         |
| openChrom_K562 |          1 | chrom_end   |            2 | chromEnd    | A         |      137373 |     NULL | NULL   |      | BTREE      |         |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

The execution plan from MySQL.

mysql> EXPLAIN SELECT `inVar`.chrom, `inVar`.pos, score  FROM `inVar`  LEFT JOIN `openChrom_K562`  ON ( inVar.chrom=openChrom_K562.chrom AND  `inVar`.pos BETWEEN chromStart AND chromEnd );
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
| id | select_type | table          | type  | possible_keys                              | key        | key_len | ref             | rows  | Extra       |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
|  1 | SIMPLE      | inVar          | index | NULL                                       | PRIMARY    | 37      | NULL            | 57902 | Using index |
|  1 | SIMPLE      | openChrom_K562 | ref   | start_end,chrom_only,chrom_start,chrom_end | chrom_only | 33      | tmp.inVar.chrom |  5973 |             |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+

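It seems that MySQL only optimizes the join by matching chrom between the two tables, and then does a brute-force comparison of the positions within each chrom.

Is there any way to optimize this further, for example by indexing on the position?

(This is my first time posting a question, so apologies for the poor quality.)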

chrom_only is probably a poor index choice for your join, because chrom has only 22 distinct values.
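
To see how unselective chrom is on its own, one could count the ranges per chrom value. This is only a sketch using the table and column names from the question; the alias n_ranges is made up for illustration.

```sql
-- Count how many ranges share each chrom value (n_ranges is an illustrative alias).
SELECT chrom, COUNT(*) AS n_ranges
FROM `openChrom_K562`
GROUP BY chrom
ORDER BY n_ranges DESC;
```

With 137373 rows and a reported cardinality of 22, that averages about 6,200 rows per chrom value (the EXPLAIN output estimates 5973), so each of the 57902 inVar rows is compared against thousands of ranges.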

If I have interpreted this correctly, the query should be faster using start_end:

SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562`
USE INDEX (`start_end`)
ON (
`inVar`.chrom=`openChrom_K562`.chrom AND
`inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
)
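
To check whether the hint actually changes the plan, the same query can be prefixed with EXPLAIN and the key column compared against the original output. This is a sketch, not a measured result; the plan depends on your MySQL version and table statistics.

```sql
-- Inspect the plan for the hinted query; look at the `key` and `Extra` columns.
EXPLAIN
SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562`
USE INDEX (`start_end`)
ON (
`inVar`.chrom=`openChrom_K562`.chrom AND
`inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
);
```

Note that USE INDEX only restricts which indexes MySQL considers, so it may still choose a table scan. If the key column still shows NULL, FORCE INDEX (`start_end`) is the stronger form of the hint. For a range condition that depends on the other table's row, Extra may show "Range checked for each record".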
