
Optimize MySQL query for group_concat function

SELECT SQL_NO_CACHE link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop

The above query takes roughly 800ms - 1000ms to run.

If I append a group_concat statement the query then takes 8 - 10 seconds:

SELECT SQL_NO_CACHE link.stop, link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude, group_concat(service.line) lines

How can I change this query so that it runs in less than 2 seconds with the group_concat statement?

SQL Fiddle: http://sqlfiddle.com/#!9/414fe

EXPLAIN statements for both queries: http://i.imgur.com/qrURgzV.png

How long does this query take?

SELECT p.section, GROUP_CONCAT(s.line)
FROM pattern p join
     service s
     ON p.service = s.code
GROUP BY p.section

I am thinking that you can do the group_concat() in a subquery, so the outer query does not need an aggregation. This can speed up queries when there is one table in the subquery. In your case, there are two.

The final result would be something like:

link.section = pattern.section

SELECT SQL_NO_CACHE . . .,
       (SELECT GROUP_CONCAT(s.line)
        FROM pattern p join
             service s
             ON p.service = s.code
        WHERE p.section = link.section
       ) as lines
FROM link JOIN
     naptan.stop
     ON stop.atco_code = link.stop JOIN
     naptan.locality
     ON locality.code = stop.nptg_locality_ref;

For this query, you want the following additional indexes: pattern(section, service) and service(code, line).

I don't know if this will work, but it is worth a try.

Note: this assumes that you really don't need the group by for the rest of the columns.
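As a sketch, the two indexes named above could be created like this. The index names are made up for illustration, and the statements assume the tables live in the current database.

```sql
-- Composite indexes for the correlated subquery.
-- The index names are arbitrary (hypothetical); pick whatever fits your conventions.
CREATE INDEX idx_pattern_section_service ON pattern (section, service);
CREATE INDEX idx_service_code_line ON service (code, line);
```

With both in place, the subquery can look up pattern rows by section and read service.line from the index alone.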

A remark: You're using the nonstandard MySQL extension to GROUP BY. It happens to work for you because link.stop is joined to stop.atco_code, which itself is a primary key. But you need to be very careful with this.

I suggest you add some compound indexes. You join in to pattern on service and join out based on section. So add this index.
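For comparison, here is a sketch of the same query with every non-aggregated column listed in the GROUP BY, which does not rely on the extension. It returns the same rows only under the assumption stated above, that each link.stop matches one stop row and one locality row. The alias service_lines is my own choice. This is about portability, and it also matters when the ONLY_FULL_GROUP_BY SQL mode is enabled; it is not expected to be faster by itself.

```sql
-- Same joins as the original query, but grouping by every selected
-- non-aggregated column instead of relying on the MySQL extension.
SELECT link.stop, stop.common_name, locality.name,
       stop.bearing, stop.latitude, stop.longitude,
       GROUP_CONCAT(service.line) AS service_lines
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop, stop.common_name, locality.name,
         stop.bearing, stop.latitude, stop.longitude;
```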

ALTER TABLE pattern ADD INDEX service_section (service, section, line);

This will let the query use just the index, and not have to hit the table itself to retrieve the information needed for the JOIN or your GROUP_CONCAT() operation. (You might also delete the index on just service; this new index makes it redundant.)

Similarly, you want to create an index (section, stop) on the link table, and get rid of the index on just section.

On stop, you're using most of the columns, and you already have an index (PK) on atco_code, so let this one be.

Finally, on locality put an index on (code, name).

All this indexing monkey business should cut down the amount of work MySQL must do to satisfy your query.

Now look, as soon as you add WHERE anything = anything to the query, you may need to add a column to one or more of these indexes. You definitely should read up on multi-column indexing and grouping; good indexing is a critical success factor for your kind of data.

You should also run ANALYZE TABLE xxxx on each of your tables after inserting lots of rows, to make sure the query optimizer can see appropriate information about the content of the table and indexes.
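The link and locality suggestions might look roughly like this. The index names are invented here, and the name of the existing single-column index on link.section is not given in the question, so the DROP is left commented out until you have looked it up.

```sql
-- Compound index on link (the name section_stop is hypothetical).
ALTER TABLE link ADD INDEX section_stop (section, stop);

-- List the existing indexes to find the one on just `section`.
SHOW INDEX FROM link;
-- Then drop it, substituting the real name reported above:
-- ALTER TABLE link DROP INDEX <name_of_section_only_index>;

-- Index on locality so that `name` can be read from the index
-- (the name code_name is hypothetical).
ALTER TABLE naptan.locality ADD INDEX code_name (code, name);
```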
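One way to check whether the new indexes are actually picked up is to run EXPLAIN before and after adding them. The sketch below uses a smaller query over the same two tables; when an index covers a table's columns, the Extra column for that table typically shows "Using index".

```sql
-- Compare the `key` and `Extra` columns before and after adding the indexes.
EXPLAIN
SELECT p.section, GROUP_CONCAT(s.line)
FROM pattern p
JOIN service s ON p.service = s.code
GROUP BY p.section;
```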
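For the tables in this query, that could be a single statement, since ANALYZE TABLE accepts a comma-separated list. This assumes service, pattern and link are in the current database and the other two in naptan, as in the query above.

```sql
-- Refresh index statistics for every table the query touches.
ANALYZE TABLE service, pattern, link, naptan.stop, naptan.locality;
```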
