
MySQL poor OR and ISNULL performance

I'm really surprised by some strange MySQL performance behaviour. The following query takes about 3 hours to run:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            (u.id = f.ips_user_id OR u.ips_user_id_holder = f.ips_user_id) AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

I also tried the following one, but got the same performance:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            IFNULL(u.ips_user_id_holder, u.id) = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

The logic is: if the "ips_user_id_holder" column is not null, I should use it; otherwise I should use the "id" column.

If I split the query into two queries, each one takes about 15 seconds to run:

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            u.ips_user_id_holder = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

UPDATE ips_invoice AS f SET ips_locality_id = (
        SELECT ips_locality_id 
        FROM ips_user_unit_locality AS uul 
        JOIN ips_user AS u ON u.id = uul.ips_user_id 
        WHERE 
            u.id = f.ips_user_id 
            AND 
            uul.date <= f.date 

        ORDER BY `date` DESC 
        LIMIT 1 
) 
WHERE f.ips_locality_id IS NULL;

It is not the first time I have run into issues with MySQL "OR" or "null checks" in relatively simple queries (Why this mysql query (with is null check) is so slower than this other one?).

The ips_invoice table has about 400,000 records, ips_user_unit_locality about 100,000 records and ips_user about 35,000 records.

I'm running MySQL 5.5.49 on an Ubuntu Amazon EC2 instance.

So, what is wrong with the first and second queries? What causes such a significant performance difference?

There is nothing "wrong" with the first and second queries. However, when you use OR in a join condition (or equivalently, a correlated subquery condition), the engine usually cannot use indexes.

That makes everything really slow.
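
One way to check this on your own data is to look at the execution plan. MySQL 5.5 can only EXPLAIN a SELECT, not an UPDATE, so the sketch below wraps the same correlated subquery in a SELECT over ips_invoice. It uses only the tables and columns from the question; the alias "locality" is made up for illustration, and what the plan shows will depend on which indexes actually exist.

```sql
-- Plan for the OR version of the correlated lookup.
-- Look at the DEPENDENT SUBQUERY rows: the "key" and "rows" columns
-- show whether an index is used and how many rows are examined
-- for each invoice.
EXPLAIN
SELECT f.ips_user_id,
       (SELECT ips_locality_id
        FROM ips_user_unit_locality AS uul
        JOIN ips_user AS u ON u.id = uul.ips_user_id
        WHERE (u.id = f.ips_user_id OR u.ips_user_id_holder = f.ips_user_id)
          AND uul.date <= f.date
        ORDER BY uul.date DESC
        LIMIT 1) AS locality
FROM ips_invoice AS f
WHERE f.ips_locality_id IS NULL;

-- Same plan request with a single equality condition, for comparison.
EXPLAIN
SELECT f.ips_user_id,
       (SELECT ips_locality_id
        FROM ips_user_unit_locality AS uul
        JOIN ips_user AS u ON u.id = uul.ips_user_id
        WHERE u.id = f.ips_user_id
          AND uul.date <= f.date
        ORDER BY uul.date DESC
        LIMIT 1) AS locality
FROM ips_invoice AS f
WHERE f.ips_locality_id IS NULL;
```

Comparing the two plans side by side should show whether the OR is what prevents an indexed lookup in your case.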

You seem to understand at least one way to fix it, so I won't propose anything else.

EDIT:

I will note that your query doesn't do exactly what you specify in the text. It gets the latest date for either of the two user ids. You seem to want to prioritize the ids. If so, this is closer to the query you want:

UPDATE ips_invoice f
    SET ips_locality_id =
        COALESCE( (SELECT ips_locality_id 
                   FROM ips_user_unit_locality uul JOIN
                        ips_user u
                        ON u.id = uul.ips_user_id 
                   WHERE u.ips_user_id_holder = f.ips_user_id AND
                         uul.date <= f.date 
                   ORDER BY uul.date DESC
                   LIMIT 1
                  ),
                  (SELECT ips_locality_id 
                   FROM ips_user_unit_locality uul
                   WHERE uul.ips_user_id = f.ips_user_id AND
                         uul.date <= f.date 
                   ORDER BY uul.date DESC
                   LIMIT 1
                  )
                )
WHERE f.ips_locality_id IS NULL;

  1. Use a multi-table UPDATE instead of = ( SELECT ... ).

  2. Instead of OR, write two separate UPDATEs.
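
As a rough sketch of point 1, the direct-user case (uul.ips_user_id = f.ips_user_id) might be written as a multi-table UPDATE like the one below. This is an illustration, not a tested drop-in replacement: it assumes ips_user_unit_locality.ips_user_id is never NULL, and it will only be fast if an index covering (ips_user_id, date) exists on that table, which the question does not confirm. The alias "newer" is made up for the example.

```sql
-- Multi-table UPDATE: pick, for each invoice, the locality row with the
-- latest date that is not after the invoice date.
UPDATE ips_invoice AS f
JOIN ips_user_unit_locality AS uul
    ON uul.ips_user_id = f.ips_user_id
   AND uul.date <= f.date
-- Anti-join: look for any later row for the same user that is still
-- on or before the invoice date.
LEFT JOIN ips_user_unit_locality AS newer
    ON newer.ips_user_id = uul.ips_user_id
   AND newer.date <= f.date
   AND newer.date > uul.date
SET f.ips_locality_id = uul.ips_locality_id
WHERE f.ips_locality_id IS NULL
  -- Keep only the row for which no later candidate exists.
  AND newer.ips_user_id IS NULL;
```

Invoices with no matching locality row are simply not touched and stay NULL. Following point 2, the holder case (u.ips_user_id_holder = f.ips_user_id) would go in a second, separate statement rather than being combined here with OR.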


 