
Mysql - optimisation - multiple group_concat & joins using having

I've looked at similar GROUP_CONCAT MySQL optimisation threads, but none seem relevant to my issue, and this one is stretching my MySQL knowledge.

I have been tasked with improving the speed of a script that contains an extremely heavy MySQL query.

The query in question uses GROUP_CONCAT to create lists of the colours, tags and sizes relevant to a particular product. It then uses HAVING / FIND_IN_SET to filter these concatenated lists for the attributes set by the user controls, and displays the results.

In the example below it is looking for all products with product_tag=1, product_colour=18 and product_size=17. So this could be a blue product (colour) in medium (size) for a male (tag).

The shop_products table contains about 3500 rows, so it is not particularly large, but the query below takes around 30 seconds to execute. It works OK with 1 or 2 joins, but adding the third just kills it.

SELECT shop_products.id, shop_products.name, shop_products.default_image_id, 
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours, 
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags, 
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM shop_products
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
WHERE shop_products.category_id =  '50'
GROUP BY shop_products.id
HAVING((FIND_IN_SET( 1, product_tags ) >0) 
AND(FIND_IN_SET( 18, product_colours ) >0)
AND(FIND_IN_SET( 17, product_sizes ) >0))
ORDER BY shop_products.name ASC 
LIMIT 0 , 30

I was hoping somebody could advise on a better way to structure this query without restructuring the database (which isn't really an option at this point without weeks of data migration and script changes)? Or any general advice on optimisation. EXPLAIN currently returns the below (as you can see, the indexes are all over the place!).

id  select_type  table                         type  possible_keys                         key          key_len  ref                           rows  Extra
1   SIMPLE       shop_products                 ref   category_id,category_id_2             category_id  2        const                         3225  Using where; Using temporary; Using filesort
1   SIMPLE       shop_product_to_colours       ref   product_id,product_id_2,product_id_3  product_id   4        candymix_db.shop_products.id  13
1   SIMPLE       shop_products_to_tag          ref   product_id,product_id_2               product_id   4        candymix_db.shop_products.id  4
1   SIMPLE       shop_product_colour_to_sizes  ref   product_id                            product_id   4        candymix_db.shop_products.id  133
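
The row estimates in this EXPLAIN output hint at why the third join hurts. All three child tables are joined on product_id only, so for each product the joins produce every combination of its colour, tag and size rows: roughly 13 × 4 × 133 = 6,916 rows per product, or about 22 million rows across the 3,225 products in the category, before GROUP_CONCAT( DISTINCT ... ) collapses them again. These figures are optimizer estimates, not measurements. A minimal sketch for checking the real fan-out, assuming the table and column names from the query above:

```sql
-- Count the child rows per product separately, then multiply the counts to see
-- how many intermediate rows the three-way join generates for that product.
-- Products with no rows in one of the child tables are left out by the inner joins.
SELECT p.id,
       c.colour_rows,
       t.tag_rows,
       s.size_rows,
       c.colour_rows * t.tag_rows * s.size_rows AS joined_rows
FROM shop_products p
    JOIN (SELECT product_id, COUNT(*) AS colour_rows
          FROM shop_product_to_colours
          GROUP BY product_id) c ON c.product_id = p.id
    JOIN (SELECT product_id, COUNT(*) AS tag_rows
          FROM shop_products_to_tag
          GROUP BY product_id) t ON t.product_id = p.id
    JOIN (SELECT product_id, COUNT(*) AS size_rows
          FROM shop_product_colour_to_sizes
          GROUP BY product_id) s ON s.product_id = p.id
WHERE p.category_id = 50
ORDER BY joined_rows DESC
LIMIT 10;
```

If joined_rows runs into the thousands for many products, the time is most likely going into building and then de-duplicating that intermediate result.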

Rewrite the query to use WHERE instead of HAVING. WHERE is applied while MySQL searches for rows, so it can use indexes. HAVING is applied after the rows have been selected and grouped, to filter the already selected result, so by design it can't use indexes.
You can do it, for example, this way:

SELECT p.id, p.name, p.default_image_id, 
    GROUP_CONCAT( DISTINCT pc.colour_id ) AS product_colours, 
    GROUP_CONCAT( DISTINCT pt.tag_id ) AS product_tags, 
    GROUP_CONCAT( DISTINCT ps.tag_id ) AS product_sizes
FROM shop_products p
    JOIN shop_product_to_colours pc_test ON p.id = pc_test.product_id AND pc_test.colour_id = 18
    JOIN shop_products_to_tag pt_test ON p.id = pt_test.product_id AND pt_test.tag_id = 1
    JOIN shop_product_colour_to_sizes ps_test ON p.id = ps_test.product_id AND ps_test.tag_id = 17
    JOIN shop_product_to_colours pc ON p.id = pc.product_id
    JOIN shop_products_to_tag pt ON p.id = pt.product_id
    JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id =  '50'
GROUP BY p.id
ORDER BY p.name ASC

Update

We are joining each table twice.
The first join checks whether the table contains the required value (the condition that used to be in FIND_IN_SET).
The second join produces the data for GROUP_CONCAT, selecting all of the product's values from the table.

Update 2

As @Matt Raines commented, if we don't need to list the product values with GROUP_CONCAT, the query becomes even simpler:

SELECT p.id, p.name, p.default_image_id
FROM shop_products p
    JOIN shop_product_to_colours pc ON p.id = pc.product_id
    JOIN shop_products_to_tag pt ON p.id = pt.product_id
    JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id =  '50'
    AND (pc.colour_id = 18 AND pt.tag_id = 1 AND ps.tag_id = 17)
GROUP BY p.id
ORDER BY p.name ASC

This will select all products that have all three filtered attributes.
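
When only the matching products are needed, the same filter can also be written with EXISTS, so each child table is probed once per product and no GROUP BY is needed to remove duplicate rows. This is an alternative formulation, not part of the answer above, and whether it is faster depends on the MySQL version and the available indexes. A minimal sketch, assuming the same tables and columns:

```sql
-- Each EXISTS stops at the first matching child row, so a product is returned
-- at most once no matter how many colour, tag or size rows it has.
SELECT p.id, p.name, p.default_image_id
FROM shop_products p
WHERE p.category_id = 50
  AND EXISTS (SELECT 1 FROM shop_product_to_colours pc
              WHERE pc.product_id = p.id AND pc.colour_id = 18)
  AND EXISTS (SELECT 1 FROM shop_products_to_tag pt
              WHERE pt.product_id = p.id AND pt.tag_id = 1)
  AND EXISTS (SELECT 1 FROM shop_product_colour_to_sizes ps
              WHERE ps.product_id = p.id AND ps.tag_id = 17)
ORDER BY p.name ASC
LIMIT 0, 30;
```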

If I understand the question correctly, what you need to do is:

  1. Find a list of all of the shop_products.id values that have the correct tag/colour/size options.
  2. Get a list of all of the tag/colour/size combinations available for each of those product ids.

I was trying to make you a SQLFiddle for this, but the site seems broken at the moment. Try something like:

SELECT shop_products.id, shop_products.name, shop_products.default_image_id, 
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours, 
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags, 
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM 
shop_products INNER JOIN
(SELECT DISTINCT shop_products.id id
 FROM
 shop_products
 LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
 LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
 LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
 WHERE
 shop_products.category_id =  '50'
 AND shop_products_to_tag.tag_id=1
 AND shop_product_to_colours.colour_id=18
 AND shop_product_colour_to_sizes.tag_id=17
) matches ON shop_products.id = matches.id
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
GROUP BY shop_products.id
ORDER BY shop_products.name ASC 
LIMIT 0 , 30;

The problem with your first approach is that it requires the database to create every combination of attributes for every product and only then filter. In my example, I filter down to the matching product ids first and then generate the combinations.

My query is untested, as I don't have a MySQL environment handy and SQLFiddle is down, but it should give you the idea.
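
The outer query in that approach still joins all three child tables at once, so each matching product again produces every colour × tag × size combination before GROUP_CONCAT( DISTINCT ... ) collapses them. One possible variation, a sketch under the same table and column names rather than the answerer's tested method, builds each list in its own correlated subquery so the three lists are never multiplied together:

```sql
SELECT p.id, p.name, p.default_image_id,
    -- Each list is built from one child table only.
    (SELECT GROUP_CONCAT(DISTINCT pc.colour_id)
       FROM shop_product_to_colours pc
      WHERE pc.product_id = p.id) AS product_colours,
    (SELECT GROUP_CONCAT(DISTINCT pt.tag_id)
       FROM shop_products_to_tag pt
      WHERE pt.product_id = p.id) AS product_tags,
    (SELECT GROUP_CONCAT(DISTINCT ps.tag_id)
       FROM shop_product_colour_to_sizes ps
      WHERE ps.product_id = p.id) AS product_sizes
FROM shop_products p
    -- Step 1: the ids of the products that have the requested tag, colour and size.
    JOIN (SELECT DISTINCT sp.id
          FROM shop_products sp
              JOIN shop_product_to_colours c ON c.product_id = sp.id AND c.colour_id = 18
              JOIN shop_products_to_tag t ON t.product_id = sp.id AND t.tag_id = 1
              JOIN shop_product_colour_to_sizes s ON s.product_id = sp.id AND s.tag_id = 17
          WHERE sp.category_id = 50) matches ON matches.id = p.id
ORDER BY p.name ASC
LIMIT 0, 30;
```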

First, I aliased the tables in your query to shorten it and improve readability.

SP = shop_products
PC = shop_product_to_colours
PT = shop_products_to_tag
PS = shop_product_colour_to_sizes

Next, your HAVING should be a WHERE, since you are explicitly looking for something. There is no need to query the entire system just to throw records away after the result is returned. Third, you had LEFT JOINs, but when a column from the joined table is tested in a WHERE or HAVING and you are not allowing for NULL, the join effectively becomes an inner JOIN (both sides required). Finally, your WHERE clause has quotes around the ID you are looking for, but the column is probably an integer anyhow. Remove the quotes.
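
The point about LEFT JOIN can be seen on a single child table. A small illustration, assuming the tag table from the question:

```sql
-- These two queries return the same rows: products without tag 1 get NULL for
-- pt.tag_id from the LEFT JOIN, and the WHERE condition then discards them.
SELECT p.id
FROM shop_products p
    LEFT JOIN shop_products_to_tag pt ON pt.product_id = p.id
WHERE pt.tag_id = 1;

SELECT p.id
FROM shop_products p
    JOIN shop_products_to_tag pt ON pt.product_id = p.id
WHERE pt.tag_id = 1;

-- A LEFT JOIN only keeps the unmatched products when the test is part of the
-- ON clause: here every product is returned, with NULL where tag 1 is missing.
SELECT p.id, pt.tag_id
FROM shop_products p
    LEFT JOIN shop_products_to_tag pt ON pt.product_id = p.id AND pt.tag_id = 1;
```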

Now for the indexes and optimization. To help with the criteria, grouping and JOINs, I would have the following composite (multi-column) indexes, instead of tables that only have single-column indexes.

table                         index
shop_products                 ( category_id, id, name )
shop_product_to_colours       ( product_id, colour_id )
shop_products_to_tag          ( product_id, tag_id )
shop_product_colour_to_sizes  ( product_id, tag_id )
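
As a sketch, the composite indexes listed above could be created like this. The index names are made up for illustration, and the table names are the ones used in the question's query. The EXPLAIN output in the question also lists several keys on the same column (product_id, product_id_2, product_id_3), so it may be worth inspecting the existing indexes first.

```sql
-- Inspect the existing indexes; repeat for the other tables.
SHOW INDEX FROM shop_product_to_colours;

-- Hypothetical index names; the column order follows the list above.
ALTER TABLE shop_products
    ADD INDEX idx_category_id_name (category_id, id, name);
ALTER TABLE shop_product_to_colours
    ADD INDEX idx_product_colour (product_id, colour_id);
ALTER TABLE shop_products_to_tag
    ADD INDEX idx_product_tag (product_id, tag_id);
ALTER TABLE shop_product_colour_to_sizes
    ADD INDEX idx_product_size (product_id, tag_id);
```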

Revised query

SELECT 
      SP.id, 
      SP.name, 
      SP.default_image_id, 
      GROUP_CONCAT( DISTINCT PC.colour_id ) AS product_colours, 
      GROUP_CONCAT( DISTINCT PT.tag_id ) AS product_tags, 
      GROUP_CONCAT( DISTINCT PS.tag_id ) AS product_sizes
   FROM 
      shop_products SP
         JOIN shop_product_to_colours PC
            ON SP.id = PC.product_id
           AND PC.colour_id = 18
         JOIN shop_products_to_tag PT
            ON SP.id = PT.product_id
           AND PT.tag_id = 1
         JOIN shop_product_colour_to_sizes PS
            ON SP.id = PS.product_id
           AND PS.tag_id = 17
   WHERE 
      SP.category_id = 50
   GROUP BY 
      SP.id
   ORDER BY 
      SP.name ASC 
   LIMIT 
      0 , 30

One final comment. Since you are ordering by name but grouping by id, the final sort might cause a delay. However, if you change the grouping to name plus id, each group is still unique by id, and adjusting the index on your shop_products table to

table                     index
Shop_Products             ( category_id, name, id )

will help both the GROUP BY and the ORDER BY, since the rows will already be in their natural order from the index.

   GROUP BY 
      SP.name,
      SP.id
   ORDER BY 
      SP.name ASC,
      SP.ID
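
Whether the optimizer really avoids the temporary table and filesort with this index depends on the MySQL version and the data, so it is worth confirming with EXPLAIN. A minimal sketch, with a hypothetical index name and assuming id is the primary key of shop_products:

```sql
-- Hypothetical index name; the column order follows the suggestion above.
ALTER TABLE shop_products
    ADD INDEX idx_category_name_id (category_id, name, id);

-- Check whether the Extra column for shop_products still shows
-- "Using temporary; Using filesort" after the index is added.
EXPLAIN
SELECT SP.id, SP.name, SP.default_image_id
FROM shop_products SP
    JOIN shop_product_to_colours PC ON SP.id = PC.product_id AND PC.colour_id = 18
    JOIN shop_products_to_tag PT ON SP.id = PT.product_id AND PT.tag_id = 1
    JOIN shop_product_colour_to_sizes PS ON SP.id = PS.product_id AND PS.tag_id = 17
WHERE SP.category_id = 50
GROUP BY SP.name, SP.id
ORDER BY SP.name ASC, SP.id
LIMIT 0, 30;
```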

Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish them, please cite this site's URL or the original address.

 