
MySQL: optimize left join with subquery

I'm concerned about performance.

Is it possible to optimize the following MySQL query?

SELECT u.name, t2.transactions, o2.orders FROM users AS u

LEFT JOIN (
    SELECT t.aid AS tuid, SUM( IF( t.status = 1, t.amount, 0 ) ) AS transactions
    FROM transactions AS t 
    WHERE ( t.datetime BETWEEN ('2018-01-01 00:00:00') AND ( '2020-01-01 23:59:59' ) ) GROUP BY t.aid
) AS t2 ON tuid = u.id 

LEFT JOIN (
    SELECT o.aid AS ouid, SUM(o.net) AS orders FROM orders AS o 
    WHERE ( o.date BETWEEN ('2018-01-01 00:00:00') AND ( '2020-01-01 23:59:59' ) ) GROUP BY o.aid 
) AS o2 ON ouid = u.id

WHERE u.status = 1
ORDER BY t2.transactions DESC

Basically, I need to sum each user's data from multiple tables (and be able to order by the results).

There's no obvious query-performance antipattern in your query. Performance depends mostly on the two subqueries with GROUP BY clauses.

Let's take a look at one of them to find some improvements.

SELECT t.aid AS tuid, 
       SUM( IF( t.status = 1, t.amount, 0 ) ) AS transactions
  FROM afs_transactions AS t 
 WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
 GROUP BY t.aid

This will be OK if you have an index on afs_transactions.datetime.

But the whole subquery can be rewritten:

SELECT t.aid AS tuid, 
       SUM( t.amount ) AS transactions
  FROM afs_transactions AS t 
 WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
   AND t.status = 1
 GROUP BY t.aid

This query can take advantage of a compound index on (status, datetime). If you have many rows with status values not equal to 1, and you have the compound index, the rewritten query will be faster. Note one difference: a user whose in-range transactions all have a status other than 1 gets 0 from the original subquery but no row from the rewritten one, so the LEFT JOIN yields NULL for that user.
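A minimal sketch of creating that compound index and checking whether the optimizer picks it up. The index name idx_status_datetime is made up for illustration; the table and column names are the ones used in the subquery above.

```sql
-- Hypothetical index name; (status, datetime) is the column order suggested above:
-- the equality column first, then the range column.
ALTER TABLE afs_transactions
  ADD INDEX idx_status_datetime (status, datetime);

-- Inspect the plan; the "key" column shows which index, if any, was chosen.
EXPLAIN
SELECT t.aid AS tuid,
       SUM( t.amount ) AS transactions
  FROM afs_transactions AS t
 WHERE t.datetime BETWEEN '2018-01-01 00:00:00' AND '2020-01-01 23:59:59'
   AND t.status = 1
 GROUP BY t.aid;
```

Whether the index is actually used depends on the data distribution, so the EXPLAIN output is worth checking against real data.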

Pro tip: BETWEEN for datetime values is generally a poor choice, because, well, 59:59. The inclusive upper bound forces you to spell out the last second of the range, and it misses fractional-second values after it if the column stores them. Use < rather than BETWEEN's <= for the end of the range.

 WHERE t.datetime >= '2018-01-01'
   AND t.datetime <  '2020-01-02'   /* notice, it's the day after the range */
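Putting the status filter and the half-open date range together, the query from the question might look like the sketch below. It assumes the table names from the question (transactions, orders) rather than afs_transactions, and it assumes the intended range runs through the end of 2020-01-01.

```sql
SELECT u.name, t2.transactions, o2.orders
  FROM users AS u
  LEFT JOIN (
        -- status filter moved into WHERE so an index on (status, datetime) can help
        SELECT t.aid AS tuid, SUM(t.amount) AS transactions
          FROM transactions AS t
         WHERE t.status = 1
           AND t.datetime >= '2018-01-01'
           AND t.datetime <  '2020-01-02'
         GROUP BY t.aid
       ) AS t2 ON t2.tuid = u.id
  LEFT JOIN (
        SELECT o.aid AS ouid, SUM(o.net) AS orders
          FROM orders AS o
         WHERE o.date >= '2018-01-01'
           AND o.date <  '2020-01-02'
         GROUP BY o.aid
       ) AS o2 ON o2.ouid = u.id
 WHERE u.status = 1
 ORDER BY t2.transactions DESC;
```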

Multiple JOIN ( SELECT ... ) used to be a performance killer (before MySQL 5.6). Now it may still be a performance problem.

An alternative is to use correlated subqueries in the SELECT list:

SELECT u.name,
       ( SELECT ... WHERE ...=u.id ) AS transactions,
       ( SELECT ... WHERE ...=u.id ) AS orders
    FROM users AS u
    WHERE  u.status = 1
    ORDER BY  transactions DESC

The first subquery is a correlated subquery, and it looks like this:

       ( SELECT SUM( IF(status = 1, amount, 0) )
            FROM  transactions
            WHERE  aid = u.id
              AND  datetime >= '2018-01-01'
              /* end is '2020-01-01', exclusive; use '2020-01-02' to include that day, as the question does */
              AND  datetime  < '2018-01-01' + INTERVAL 2 YEAR
       ) AS transactions

(The other one is similar.)
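For reference, a sketch of the complete correlated-subquery form. The second subquery is an assumption modeled on the orders subquery in the question, and the end of the range is written as '2020-01-02' on the assumption that 2020-01-01 should be included.

```sql
SELECT u.name,
       ( SELECT SUM( IF(t.status = 1, t.amount, 0) )
           FROM transactions AS t
          WHERE t.aid = u.id              -- correlated on the outer user row
            AND t.datetime >= '2018-01-01'
            AND t.datetime <  '2020-01-02'
       ) AS transactions,
       ( SELECT SUM(o.net)                -- assumed shape, mirrors the question's orders subquery
           FROM orders AS o
          WHERE o.aid = u.id
            AND o.date >= '2018-01-01'
            AND o.date <  '2020-01-02'
       ) AS orders
  FROM users AS u
 WHERE u.status = 1
 ORDER BY transactions DESC;
```

Each subquery runs once per matching user row, so this form tends to work best when the indexes on aid are in place and the number of users with status = 1 is modest.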

Indexes:

users:         INDEX(status, name, id)   -- "covering"
transactions:  INDEX(aid, datetime)
orders:        INDEX(aid, date)  or  INDEX(aid, date, net)
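As DDL, those indexes could be added roughly as follows. The index names are invented for illustration, and the statements assume the tables do not already have equivalent indexes.

```sql
-- Hypothetical index names; column lists are the ones given above.
ALTER TABLE users        ADD INDEX idx_status_name_id (status, name, id);
ALTER TABLE transactions ADD INDEX idx_aid_datetime   (aid, datetime);
-- Including net makes the index covering for the orders subquery.
ALTER TABLE orders       ADD INDEX idx_aid_date_net   (aid, date, net);
```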
