简体   繁体   中英

SQL Nested set query with join and sub query using “in” slow

I have some trouble with a query on a nested set structure that is slow (~4 sec.)

select node.ID, node.Lft, node.Rgt, node.Level, count(ag.ArticleID) as count
from Webshop.Category node
left join Webshop.Category parent on parent.Lft between node.Lft and node.Rgt
left join Webshop.ArticleGroup ag on parent.GroupID = ag.GroupID
    and ag.ArticleID in (
        select a.ID
        from Webshop.Article a
        join Webshop.ArticleOwner ao on ao.ArticleID = a.ID and ao.OWNRID = 1
        join Webshop.ArticleAssortment aa on aa.ArticleID = a.ID and aa.AssortmentID = 6
    )
group by node.ID
having count > 0

The explain returns the following:

id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
1,PRIMARY,node,ALL,NULL,NULL,NULL,NULL,2538,"Using temporary; Using filesort"
1,PRIMARY,parent,ALL,Lft,NULL,NULL,NULL,2538,
1,PRIMARY,ag,ref,fk_ArticleGroup_Group1_idx,fk_ArticleGroup_Group1_idx,4,Webshop.parent.GroupID,9,"Using index"
2,"DEPENDENT SUBQUERY",a,eq_ref,PRIMARY,PRIMARY,4,func,1,"Using index"
2,"DEPENDENT     SUBQUERY",ao,eq_ref,"ArticleIDOWNRID,fk_ArticleOwner_Article1_idx,fk_ArticleOwner_OWNR1_idx",ArticleI    DOWNRID,8,"Webshop.a.ID,const",1,"Using index"
2,"DEPENDENT SUBQUERY",aa,eq_ref,"PRIMARY,fk_ArticleAssortment_Article1_idx",PRIMARY,8,"Webshop.a.ID,const",1,"Using index"

I think the sub query with in() is making the query slow. Is there a better way to do this?

Thanks.

EDIT:

I forgot to remove the left from join in the sub query.

You could try JOIN ing:

select node.ID, node.Lft, node.Rgt, node.Level, count(ag.ArticleID) as count
from Webshop.Category node
left join Webshop.Category parent on parent.Lft between node.Lft and node.Rgt
left join Webshop.ArticleGroup ag on parent.GroupID = ag.GroupID
        join (
        select a.ID as `id`
        from Webshop.Article a
        left join Webshop.ArticleOwner ao on ao.ArticleID = a.ID and ao.OWNRID = 1
        left join Webshop.ArticleAssortment aa on aa.ArticleID = a.ID and aa.AssortmentID = 6
    ) in_ids on ag.ArticleID = in_ids.id
group by node.ID
having count > 0

First off, you have a load of left joins here and left joins are slow - I assume you're using them because they're necessary here (ie the joined tables may not contain matches and you want to return every row of Webshop.Category regardless of presence of joined rows). If left joins are not necessary I would convert to inner joins. If left joins are necessary I would give some thought to better database design to eliminate the need for these left joins. It may be unavoidable, but I would be surprised.

If converting to inner joins is not possible, or if the query is still slow then you might try running the sub query first as an insert into statement, creating a temp table, and then using that temp table in place of the sub query (probably as a join rather than an "in").

If that's still slow you might try indexing that temp table on ID.

(In all of the above, I assume the tables are sensibly indexed?)

Your IN sub-query contains LEFT JOIN clauses that don't affect the result, so first of all you can change it to this:

select node.ID, node.Lft, node.Rgt, node.Level, count(ag.ArticleID) as count
from Webshop.Category node
left join Webshop.Category parent on parent.Lft between node.Lft and node.Rgt
left join Webshop.ArticleGroup ag on parent.GroupID = ag.GroupID
    and ag.ArticleID in (
        select a.ID
        from Webshop.Article a
    )
group by node.ID
having count > 0

Now once we have done this, we can see that you are checking the consistency of the AritcleID foreign key. So there are two options. First, there is a foreign key and you don't have to worry about this, remove IN completely:

select node.ID, node.Lft, node.Rgt, node.Level, count(ag.ArticleID) as count
from Webshop.Category node
left join Webshop.Category parent on parent.Lft between node.Lft and node.Rgt
left join Webshop.ArticleGroup ag on parent.GroupID = ag.GroupID
group by node.ID
having count > 0

Second option, you don't have the foreign key. It is not a bad idea to create one.

UPDATE: I have just noticed that you have count > 0 in the end of the query, which means that you can replace all left join clauses with inner join with same result.

As you replaced left joins with inner join in the sub-query the only thing that might change the speed here is replacement of IN with EXISTS :

select node.ID, node.Lft, node.Rgt, node.Level, count(ag.ArticleID) as count
from Webshop.Category node
inner join Webshop.Category parent on parent.Lft between node.Lft and node.Rgt
inner join Webshop.ArticleGroup ag on parent.GroupID = ag.GroupID
WHERE EXISTS (
    SELECT * FROM
        from Webshop.Article a
        join Webshop.ArticleOwner ao on ao.ArticleID = a.ID and ao.OWNRID = 1
        join Webshop.ArticleAssortment aa on aa.ArticleID = a.ID and aa.AssortmentID = 6 
    WHERE a.ID = ag.ArticleID
    )
group by node.ID
-- you don't need HAVING clause here

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM