How can I speed up this SQL query on MySQL 4.1?
I have a SQL query that takes several minutes to run on MySQL. The query runs against a table with over 100 million rows, so I'm not surprised it's slow. In theory, though, it should be possible to speed it up, as I only want to get back the rows from the large table (let's call it A) that have a reference in another table, B.
So my query is:
SELECT id FROM A, B where A.ref = B.ref;
(A has over 100 million rows; B has just a few thousand.)
I've added indexes:
alter table A add index(ref);
alter table B add index(ref);
But it's still very slow (several minutes; I'd be happy with one minute).
Unfortunately, I'm stuck with MySQL 4.1.22, so I can't use views.
I'd rather not copy all of the relevant rows from A into a separate, smaller table, as the rows that I need will change from time to time. On the other hand, at the moment that's the only solution I can think of.
Any suggestions welcome!
EDIT: Here's the output of running EXPLAIN on my query:
+----+-------------+-------+------+---------------+-------+---------+----------------+-------+-------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref            | rows  | Extra       |
+----+-------------+-------+------+---------------+-------+---------+----------------+-------+-------------+
| 1  | SIMPLE      | B     | ALL  | B_ref,ref     | NULL  | NULL    | NULL           | 16718 | Using where |
| 1  | SIMPLE      | A     | ref  | A_REF,ref     | A_ref | 4       | DATABASE.B.ref | 5655  |             |
+----+-------------+-------+------+---------------+-------+---------+----------------+-------+-------------+
(In redacting my original query example, I chose to use "ref" as my column name, which happens to be the same as one of the types, but hopefully that's not too confusing...)
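The query optimizer is probably already doing the best it can, but in the unlikely event that it is reading the giant table (A) first, you can explicitly tell it to read B first using the STRAIGHT_JOIN syntax: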
SELECT STRAIGHT_JOIN id FROM B, A where B.ref = A.ref;
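To check whether the hint changes anything, the join order can be compared with EXPLAIN before and after. A minimal sketch, assuming `id` is a column of A (the original query leaves it unqualified):

```sql
-- Plan chosen by the optimizer on its own.
EXPLAIN SELECT A.id FROM A, B WHERE A.ref = B.ref;

-- Plan with the join order forced: tables are read in the order
-- they are listed in the FROM clause (B first, then A).
EXPLAIN SELECT STRAIGHT_JOIN A.id FROM B, A WHERE B.ref = A.ref;
```

If both plans list B first, as the EXPLAIN output in the question already does, the hint is unlikely to make a difference.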
From the answers, it seems like you're already doing the most efficient thing you can with the SQL. The A table seems to be the big problem. How about splitting it into three individual tables, kind of like a local version of sharding? Alternatively, is it worth denormalising the B table into the A table, assuming B doesn't have too many columns?
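As a rough sketch of the denormalising idea in its simplest form, A could carry a flag recording whether its ref appears in B. The column name `in_b` is hypothetical, and the flag would have to be refreshed whenever B changes, which may be a drawback given that the wanted rows change from time to time:

```sql
-- Hypothetical flag column. ALTER TABLE rebuilds the table on MySQL 4.1,
-- so this step alone will be slow on 100 million rows.
ALTER TABLE A ADD COLUMN in_b TINYINT(1) NOT NULL DEFAULT 0;

-- Multi-table UPDATE (supported since MySQL 4.0.4):
-- mark the rows of A whose ref exists in B.
UPDATE A, B SET A.in_b = 1 WHERE A.ref = B.ref;

-- The lookup then no longer needs the join.
SELECT id FROM A WHERE in_b = 1;
```

Whether the final query is actually faster depends on how many rows are flagged, so it is worth measuring before committing to the extra column.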
Finally, you might just have to buy a faster box to run it on; there's no substitute for horsepower!
Good luck.
SELECT id FROM A JOIN B ON A.ref = B.ref
You may be able to optimize further by using an appropriate type of join, e.g. LEFT JOIN:
http://en.wikipedia.org/wiki/Join_(SQL)
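For reference, a sketch of how the two join types differ on these tables, assuming `id` belongs to A. An inner join returns only the rows of A that have a match in B, which is what the question asks for. A LEFT JOIN returns every row of A, so it needs an extra filter to give the same rows:

```sql
-- Inner join: only rows of A whose ref appears in B.
SELECT A.id FROM A INNER JOIN B ON A.ref = B.ref;

-- Left join: all rows of A; unmatched rows carry NULL in B's columns,
-- so they have to be filtered out to get the same result.
SELECT A.id FROM A LEFT JOIN B ON A.ref = B.ref WHERE B.ref IS NOT NULL;
```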