Optimize MySQL query for group_concat function
SELECT SQL_NO_CACHE link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop
The above query takes roughly 800ms - 1000ms to run.
If I append a group_concat statement, the query then takes 8 - 10 seconds:
SELECT SQL_NO_CACHE link.stop, link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude, group_concat(service.line) lines
How can I change this query so that it runs in less than 2 seconds with the group_concat statement?
SQL Fiddle: http://sqlfiddle.com/#!9/414fe
EXPLAIN statements for both queries: http://i.imgur.com/qrURgzV.png
How long does this query take?
SELECT p.section, GROUP_CONCAT(s.line)
FROM pattern p join
service s
ON p.service = s.code
GROUP BY p.section
I am thinking that you can do the group_concat() in a subquery, so the outer query does not need an aggregation. This can speed up queries when there is one table in the subquery. In your case, there are two.
The final result would be something like the following, with the subquery correlated on link.section = pattern.section:
SELECT SQL_NO_CACHE . . .,
(SELECT GROUP_CONCAT(s.line)
FROM pattern p join
service s
ON p.service = s.code
WHERE p.section = link.section
) as lines
FROM link JOIN
naptan.stop
ON stop.atco_code = link.stop JOIN
naptan.locality
ON locality.code = stop.nptg_locality_ref;
For this query, you want the following additional indexes: pattern(section, service) and service(code, line).
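The two indexes suggested above could be created roughly as follows. This is only a sketch: the index names idx_pattern_section_service and idx_service_code_line are made up for illustration, and it assumes no equivalent index already exists on these tables.

```sql
-- Supports the correlated lookup WHERE p.section = link.section,
-- with service available from the same index for the join.
ALTER TABLE pattern ADD INDEX idx_pattern_section_service (section, service);

-- Supports the join on service.code and lets line be read from the index.
ALTER TABLE service ADD INDEX idx_service_code_line (code, line);
```

Running EXPLAIN on the query before and after adding them should show whether the optimizer actually picks them up.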
I don't know if this will work, but it is worth a try.
Note: this is assuming that you really don't need the group by for the rest of the columns.
A remark: You're using the nonstandard MySQL extension to GROUP BY. It happens to work for you because link.stop is joined to stop.atco_code, which itself is a primary key. But you need to be very careful with this.
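One way to avoid relying on the extension is to list every non-aggregated column in the GROUP BY. The sketch below reuses the columns from the question's query; it has not been run against the real schema, and the longer GROUP BY may change the execution plan, so it is worth timing.

```sql
-- Standard-style grouping: every selected non-aggregate column is grouped.
-- The alias is quoted with backticks because LINES is a reserved word in MySQL.
SELECT link.stop, stop.common_name, locality.name,
       stop.bearing, stop.latitude, stop.longitude,
       GROUP_CONCAT(service.line) AS `lines`
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop, stop.common_name, locality.name,
         stop.bearing, stop.latitude, stop.longitude;
```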
I suggest you add some compound indexes. You join in to pattern on service and join out based on section. So add this index.
ALTER TABLE pattern ADD INDEX service_section (service, section, line);
This will let the query use just the index, and not have to hit the table itself to retrieve the information needed for the JOIN or your GROUP_CONCAT() operation. (You might also delete the index on just service; this new index makes it redundant.)
Similarly, you want to create an index (section, stop) on the link table, and get rid of the index on just section.
On stop, you're using most of the columns, and you already have an index (PK) on atco_code, so let this one be.
Finally, on locality put an index on (code, name).
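The link and locality indexes described above might be added like this. The index names section_stop and code_name are invented for the example, and the name of the existing single-column index to drop is not known from the question, so that part is left commented out with a placeholder.

```sql
-- Compound index on link: join in on section, read stop from the index.
ALTER TABLE link ADD INDEX section_stop (section, stop);

-- Compound index on locality: join on code, read name from the index.
ALTER TABLE naptan.locality ADD INDEX code_name (code, name);

-- List the existing indexes to find the single-column one on section.
SHOW INDEX FROM link;

-- Hypothetical name; substitute the Key_name reported by SHOW INDEX.
-- ALTER TABLE link DROP INDEX old_section_index;
```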
All this indexing monkey business should cut down the amount of work MySQL must do to satisfy your query.
Now look, as soon as you add WHERE anything = anything to the query, you may need to add a column to one or more of these indexes. You definitely should read up on multi-column indexing and grouping; good indexing is a critical success factor for your kind of data.
You should also run ANALYZE TABLE xxxx on each of your tables after inserting lots of rows, to make sure the query optimizer can see appropriate information about the content of the table and indexes.
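For the tables in the question, that could look like the sketch below. ANALYZE TABLE accepts a comma-separated list, and re-running EXPLAIN afterwards shows whether the plan changed. The table names are taken from the question's query; nothing here has been run against the real data.

```sql
-- Refresh the key distribution statistics for all five tables.
ANALYZE TABLE service, pattern, link, naptan.stop, naptan.locality;

-- Re-check the plan; "Using index" in the Extra column means the
-- row was satisfied from an index without reading the table.
EXPLAIN
SELECT link.stop, stop.common_name, locality.name,
       stop.bearing, stop.latitude, stop.longitude,
       GROUP_CONCAT(service.line) AS `lines`
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop;
```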