
Faster search query for IN statement with SELECT in MySQL

I'm working on a query for my app that finds the store nearest to my current position. To do that, I first need to get every item with the same name, fetch its information, and then narrow the results down. I used an IN clause for this, but because the items being searched for also come from a list, I need a second SELECT inside it. Here is my code so far:

select *
from product p,
store s,
branches b
where 1 = 1
and b.idproduct = p.idproduct
and p.store = s.idstore
and common_name IN(SELECT p.common_name
FROM shopping_list_content s, product p
WHERE 1 =1
AND s.iditem = p.idproduct
AND s.idlist =$listid)

It works the way I want, but I'd like it to run faster. Currently the query takes more than 3 seconds; under a second would be much better. Are there any other options I can use for this?

MySQL has difficulty optimising subqueries. When you write something like:

SELECT  *
FROM    T
WHERE   T.ID IN (SELECT ID FROM T2);

It is sometimes rewritten as:

SELECT  *
FROM    T
WHERE   EXISTS
        (   SELECT  1
            FROM    T2
            WHERE   T.ID = T2.ID
        );

The subquery is then executed once per row in T, whereas if you write:

SELECT  T.*
FROM    T
        INNER JOIN
        (   SELECT  DISTINCT ID
            FROM    T2
        ) T2
            ON T2.ID = T.ID;

Your result set will be the same, but MySQL will first fill an in-memory table with the results of the subquery and hash it on T2.ID; it then only needs to look up each row of T against this hash table.
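To make the "same result set" point concrete, here is a small sketch on toy data. The tables, the `label` column, the derived-table alias `d`, and the rows are all made up for illustration; only the query shapes come from the text above.

```sql
-- Hypothetical toy tables; names and data are illustrative only.
CREATE TABLE T  (ID INT NOT NULL, label VARCHAR(10));
CREATE TABLE T2 (ID INT NOT NULL);

INSERT INTO T  VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO T2 VALUES (1), (1), (3);   -- note the duplicate 1

-- IN form: returns the rows with ID 1 and 3, once each.
SELECT  T.*
FROM    T
WHERE   T.ID IN (SELECT ID FROM T2);

-- Join against a de-duplicated derived table: the same two rows.
SELECT  T.*
FROM    T
        INNER JOIN
        (   SELECT  DISTINCT ID
            FROM    T2
        ) d
            ON d.ID = T.ID;

-- Without DISTINCT the duplicate in T2 leaks through:
-- the row with ID 1 is returned twice.
SELECT  T.*
FROM    T
        INNER JOIN T2
            ON T2.ID = T.ID;
```

The last query is why the DISTINCT matters: a plain join only matches the IN form when the joined column is unique on the subquery side.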

Which behaviour you want really depends on how much data you are expecting from each table/subquery. If you have 1 million rows in T2 and 10 in T, then there is no point in filling a temporary table with 1 million rows only to use it 10 times. If, on the other hand, you have a large number of rows in T and only a small number in T2, the additional cost of materialising the subquery will pay off in the long run.
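One way to see which strategy the server actually picked is to prefix the query with EXPLAIN. The sketch below runs against the tables from the question; the literal 1 is a stand-in for $listid, and the alias p2 is only there to keep the inner and outer product references apart.

```sql
-- 1 stands in for $listid; substitute a real list id.
EXPLAIN
SELECT  p.idproduct
FROM    product p
WHERE   p.common_name IN
        (   SELECT  p2.common_name
            FROM    shopping_list_content slc
                    INNER JOIN product p2
                        ON slc.iditem = p2.idproduct
            WHERE   slc.idlist = 1
        );
```

As a rough guide, a select_type of DEPENDENT SUBQUERY in the output suggests the subquery is being re-evaluated for outer rows, while DERIVED or MATERIALIZED suggests it was built once. The exact output differs between MySQL versions, so treat this as a diagnostic starting point rather than a definitive reading.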

Another thing to point out (which has no impact on performance): the JOIN syntax you are using is the ANSI 89 syntax, which was replaced by the ANSI 92 explicit JOIN syntax over 20 years ago. Although directed at SQL Server, I think this article summarises the reasons to switch to the newer join syntax very well. This makes your final query:

SELECT  *
FROM    product p
        INNER JOIN store s
            ON p.store = s.idstore
        INNER JOIN branches b
            ON b.idproduct = p.idproduct
        INNER JOIN
        (   -- de-duplicated names of the items on the shopping list
            SELECT DISTINCT p.common_name
            FROM    shopping_list_content s
                    INNER JOIN product p
                        ON s.iditem = p.idproduct
            WHERE   s.idlist = $listid
        ) sl    -- distinct alias: "s" is already used for store
            ON sl.common_name = p.common_name;

NB: Most of the above does not apply if you are using MySQL 5.6.5 or later. That version introduced more subquery optimizations that solved a lot of the issues above.
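If you are not sure which side of that version boundary your server is on, a quick check along these lines may help. Both statements are standard MySQL; which flags appear in the second result depends on the server version.

```sql
-- Report the server version, e.g. to compare against 5.6.5.
SELECT VERSION();

-- List the optimizer flags; on 5.6 and later this includes
-- subquery-related entries such as semijoin and materialization.
SELECT @@optimizer_switch;
```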

This is your query fixed up to use proper join syntax:

select *
from product p join
     store s
     on p.store = s.idstore join
     branches b
     on b.idproduct = p.idproduct
where p.common_name IN (SELECT p.common_name
                        FROM shopping_list_content slc join
                             product p
                             ON slc.iditem = p.idproduct AND
                                slc.idlist = $listid
                       );

Assuming that the same common_name does not appear on multiple products and that shopping_list_content has no duplicate rows, you can replace this with a simple join:

select *
from product p join
     store s
     on p.store = s.idstore join
     branches b
     on b.idproduct = p.idproduct join
     shopping_list_content slc
     on slc.iditem = p.idproduct and
        slc.idlist = $listid;

However, those assumptions may not be true. In that case, changing the subquery to use exists may help performance:

select *
from product p join
     store s
     on p.store = s.idstore join
     branches b
     on b.idproduct = p.idproduct
where exists (SELECT 1
              FROM shopping_list_content slc join
                   product p2
                   on slc.iditem = p2.idproduct AND
                      slc.idlist = $listid
              WHERE p.common_name = p2.common_name
             );

For this latter query, an index on product(common_name, idproduct) along with one on shopping_list_content(iditem, idlist) should help.
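A minimal sketch of creating those two indexes follows. The index names are hypothetical, and the statements assume common_name is a type that can be indexed without a prefix length (for example VARCHAR rather than TEXT).

```sql
-- Index names are hypothetical; pick names that match your conventions.
CREATE INDEX idx_product_common_name
    ON product (common_name, idproduct);

CREATE INDEX idx_slc_item_list
    ON shopping_list_content (iditem, idlist);
```

Re-running the query under EXPLAIN afterwards is a reasonable way to confirm whether the optimizer actually uses them.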
