
MySQL subquery optimisation

I have 4 tables, shown here in minimal form:

Sales:

id
has_discount
discount_is_percentage
discount_amount
**sale_date_time**
**order_status**

Sales_items:

id
**sales_id**
has_discount
discount_is_percentage
discount_amount
**product_id** (This can sometimes be null)
price_inc_vat_per_item
quantity
vat_rate
is_removed

Sales_payments:

id
**sales_id**
payment_amount
payment_change
payment_method

Products:

id
product_name

I have a query that calculates the discount on the fly and reports on it. It worked well while the total number of records stayed below 100-200k, but as the number grows the query has become really slow. I guess that's because of the subqueries I am using. Could anyone shed some light on this? There is a client_id and an outlet_id on each table that distinguish one user's rows from those of other users in the system.

Currently the tables have 1-3 million rows, and the client in question has 300k-600k. The query takes 30+ seconds. For clients with fewer rows it can finish in under a second. The columns marked with stars are indexed. How can the query be improved while still returning the same results? The query I have now:

SELECT  DATE_FORMAT(CONVERT_TZ(sales.sale_date_time,'UTC','Europe/London'), '%l%p') as title,
        count(*) as total_sales,
        SUM(sales_items.quantity) as total_quantities,
        SUM(sales_items.price_before_line_discount) as price_before_line_discount,
        SUM(sales_items.price_before_line_discount-sales_items.line_discount) as price_after_line_discount,
        SUM(sales_items.vat_rated_sales) as vat_rated_sales_before_discount,
        SUM(sales_items.zero_rated_sales) as zero_rated_sales_before_discount,
        SUM(sales_items.total_vat_only) as total_vat_only_before_discount,
        SUM(sales_payments.payment_taken) as payment_taken,
        SUM(sales_items.line_discount) as total_line_discount,
        SUM(sales_payments.payment_cash) as payment_cash,
        SUM( CASE WHEN sales.has_discount=1 AND sales.discount_is_percentage=0 THEN sales.discount_amount
                  WHEN sales.has_discount=1 AND sales.discount_is_percentage=1 THEN ((sales_items.price_before_line_discount-sales_items.line_discount)*sales.discount_amount/100)
                  WHEN sales.has_discount=0 THEN 0 END
           ) as total_sales_discount,
        SUM( CASE WHEN sales.has_discount=1 THEN
                  CASE WHEN discount_is_percentage=0 THEN (sales_items.vat_rated_sales*sales.discount_amount)/(sales_items.price_before_line_discount-sales_items.line_discount)
                       WHEN discount_is_percentage=1 THEN (sales_items.vat_rated_sales*((sales_items.price_before_line_discount-sales_items.line_discount)*sales.discount_amount/100))/(sales_items.price_before_line_discount-sales_items.line_discount) END
             ELSE 0 END
           ) as vat_rated_sales_discount,
        SUM( CASE WHEN sales.has_discount=1 THEN
                  CASE WHEN discount_is_percentage=0 THEN (sales_items.zero_rated_sales*sales.discount_amount)/(sales_items.price_before_line_discount-sales_items.line_discount)
                       WHEN discount_is_percentage=1 THEN ((sales_items.zero_rated_sales*((sales_items.price_before_line_discount-sales_items.line_discount)*sales.discount_amount/100))/(sales_items.price_before_line_discount-sales_items.line_discount)) END
             ELSE 0 END
           ) as zero_rated_sales_discount,
        SUM( CASE WHEN sales.has_discount=1 THEN
                  CASE WHEN discount_is_percentage=0 THEN (sales_items.total_vat_only*sales.discount_amount)/(sales_items.price_before_line_discount-sales_items.line_discount)
                       WHEN discount_is_percentage=1 THEN (sales_items.total_vat_only*((sales_items.price_before_line_discount-sales_items.line_discount)*sales.discount_amount/100))/(sales_items.price_before_line_discount-sales_items.line_discount) END
             ELSE 0 END
           ) as total_vat_only_discount
    FROM  `sales`
    left join  
    (
        SELECT  sales_id,
                SUM(quantity) as quantity,
                SUM(price_inc_vat_per_item*quantity) AS price_before_line_discount,
                SUM( CASE WHEN has_discount=1 AND discount_is_percentage=0 THEN discount_amount
                          WHEN has_discount=1 AND discount_is_percentage=1 THEN ((price_inc_vat_per_item*quantity)*discount_amount/100)
                          WHEN has_discount=0 THEN 0 END
                   ) as line_discount,
                SUM( CASE WHEN vat_rate>0 THEN
                          CASE WHEN has_discount=1 AND discount_is_percentage=0 THEN ((price_inc_vat_per_item*quantity)-discount_amount)
                               WHEN has_discount=1 AND discount_is_percentage=1 THEN ((price_inc_vat_per_item*quantity)-((price_inc_vat_per_item*quantity)*discount_amount/100))
                               WHEN has_discount=0 THEN (price_inc_vat_per_item*quantity) END
                     ELSE 0 END
                   ) as vat_rated_sales,
                SUM( CASE WHEN vat_rate=0 THEN
                          CASE WHEN has_discount=1 AND discount_is_percentage=0 THEN ((price_inc_vat_per_item*quantity)-discount_amount)
                               WHEN has_discount=1 AND discount_is_percentage=1 THEN ((price_inc_vat_per_item*quantity)-((price_inc_vat_per_item*quantity)*discount_amount/100))
                               WHEN has_discount=0 THEN (price_inc_vat_per_item*quantity) END
                     ELSE 0 END
                   ) as zero_rated_sales,
                SUM( CASE WHEN vat_rate>0 THEN
                          CASE WHEN has_discount=1 AND discount_is_percentage=0 THEN ((price_inc_vat_per_item*quantity)-discount_amount)-((price_inc_vat_per_item*quantity)-discount_amount)/(1+(vat_rate/100))
                               WHEN has_discount=1 AND discount_is_percentage=1 THEN ((price_inc_vat_per_item*quantity)-((price_inc_vat_per_item*quantity)*discount_amount/100))-((price_inc_vat_per_item*quantity)-((price_inc_vat_per_item*quantity)*discount_amount/100))/(1+(vat_rate/100))
                               WHEN has_discount=0 THEN (price_inc_vat_per_item*quantity)-(price_inc_vat_per_item*quantity)/(1+(vat_rate/100)) END
                     ELSE 0 END
                   ) as total_vat_only
            FROM  sales_items
            WHERE  client_id='0fe26d93-775f-440c-a119-13cbcb6cbc0c'
              AND  is_removed=0
            GROUP BY  sales_id 
    ) as sales_items  ON `sales`.`id` = `sales_items`.`sales_id`
    left join  
    (
        SELECT  sales_id, SUM(payment_amount-payment_change) payment_taken,
                SUM(CASE WHEN payment_method='CASH' THEN (payment_amount-payment_change) ELSE 0 END) as payment_cash
            FROM  sales_payments
            WHERE  client_id='0fe26d93-775f-440c-a119-1396c36cbc0c'
            GROUP BY  sales_id
    ) as sales_payments  ON `sales`.`id` = `sales_payments`.`sales_id`
    WHERE  `sales`.`client_id` = '0fe26d93-775f-440c-a119-1396c36cbc0c'
      and  `sales`.`outlet_id` = 'd5b74bdf-5cef-4455-bf99-13cbcb6cbc0c'
      and  `sales`.`order_status` = 'COMPLETED'
      and  `sale_date_time` >= '2016-01-28 00:00:00'
      and  `sale_date_time` <= '2016-11-28 23:59:00'
    GROUP BY  HOUR(CONVERT_TZ(sales.sale_date_time,'UTC','Europe/London'))
    ORDER BY  `sale_date_time` ASC

UPDATE:

To answer the questions from @rick-james:

  • I need to sort by sale_date_time, which is a datetime field. The GROUP BY is needed to report by hour. The report can also be grouped by day, month-year, etc., depending on the period queried.
  • I had to use UUIDs because of the design. The whole DB is around 8GB, and these four tables account for most of it. The index length is bigger than the actual data size because I have lots of foreign key constraints.

It's on Amazon Aurora with 15GB RAM.

Sales table: 0.5GB data, 1.3GB index

Sales items: 1.3GB data, 3.2GB index

Sales payments: 0.5GB data, 1.1GB index

All tables use the utf8_unicode_ci collation.

  • It's using Aurora 5.6, which corresponds to MySQL 5.6. Here is the EXPLAIN SELECT output:

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|----|-------------|-------|------|---------------|-----|---------|-----|------|----------|-------|
| 1 | PRIMARY | sales | ref | sales_client_id_outlet_id_foreign,sales_client_id_index,sales_outlet_id_index,sales_sale_date_time_index,sales_order_status_index | sales_client_id_index | 108 | const | 5352 | | Using index condition; Using where; Using temporary; Using filesort |
| 1 | PRIMARY | | ref | | | 108 | MyDB.sales.id | 10 | | |
| 1 | PRIMARY | | ref | | | 108 | MyDB.sales.id | 10 | | |
| 3 | DERIVED | sales_payments | ref | sales_payments_client_id_outlet_id_foreign,sales_payments_client_id_index | sales_payments_client_id_outlet_id_foreign | 108 | const | 5092 | | Using index condition; Using where; Using temporary; Using filesort |
| 2 | DERIVED | sales_items | ref | sales_items_client_id_outlet_id_foreign,sales_items_client_id_index | sales_items_client_id_outlet_id_foreign | 108 | const | 13340 | | Using index condition; Using where; Using temporary; Using filesort |
| 2 | DERIVED | products | eq_ref | PRIMARY,products_id_unique | PRIMARY | 108 | MyDB.sales_items.product_id | 1 | | |

  • I may look into storing the results in the DB and reading them from there. The only problem is that old orders can be amended, and the totals would need to be rebuilt whenever that happens.

Is there any other way to rewrite the query to get the desired result?
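For illustration, storing the results and rebuilding them after an amendment could look roughly like the sketch below. Everything new here is an assumption: the table name sales_hourly_summary, its column types (CHAR(36) for the UUIDs, DECIMAL for quantities), the choice of UTC hour buckets, and the example hour being rebuilt. Only two of the aggregates are carried over to keep it short.

```sql
-- Hypothetical summary table: one row per client, outlet, status and UTC hour.
-- Names and column types are assumptions, not taken from the real schema.
CREATE TABLE sales_hourly_summary (
    client_id        CHAR(36)      NOT NULL,
    outlet_id        CHAR(36)      NOT NULL,
    order_status     VARCHAR(20)   NOT NULL,
    sale_hour        DATETIME      NOT NULL,  -- sale_date_time truncated to the hour
    total_sales      INT UNSIGNED  NOT NULL,
    total_quantities DECIMAL(14,3) NOT NULL,
    PRIMARY KEY (client_id, outlet_id, order_status, sale_hour)
);

-- When an order in a given hour is amended, rebuild only that hour's rows.
-- Deleting first also clears rows for statuses that no longer occur.
START TRANSACTION;

DELETE FROM sales_hourly_summary
    WHERE  client_id = '0fe26d93-775f-440c-a119-1396c36cbc0c'
      AND  outlet_id = 'd5b74bdf-5cef-4455-bf99-13cbcb6cbc0c'
      AND  sale_hour = '2016-11-28 14:00:00';

INSERT INTO sales_hourly_summary
        (client_id, outlet_id, order_status, sale_hour, total_sales, total_quantities)
SELECT  s.client_id, s.outlet_id, s.order_status,
        '2016-11-28 14:00:00',
        COUNT(DISTINCT s.id),              -- the join repeats a sale once per item
        COALESCE(SUM(si.quantity), 0)      -- sales without items contribute 0
    FROM  sales AS s
    LEFT JOIN  sales_items AS si
           ON  si.sales_id = s.id
          AND  si.is_removed = 0
    WHERE  s.client_id = '0fe26d93-775f-440c-a119-1396c36cbc0c'
      AND  s.outlet_id = 'd5b74bdf-5cef-4455-bf99-13cbcb6cbc0c'
      AND  s.sale_date_time >= '2016-11-28 14:00:00'
      AND  s.sale_date_time <  '2016-11-28 15:00:00'
    GROUP BY  s.client_id, s.outlet_id, s.order_status;

COMMIT;
```

The report would then aggregate over sales_hourly_summary instead of the raw tables, applying CONVERT_TZ to sale_hour at read time. Because each rebuild touches a single hour, an amended order costs one small delete and insert rather than a full recalculation.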

  • When ORDER BY unnecessarily differs from GROUP BY, an extra sort pass is needed.
  • UUIDs are terribly inefficient when the data is bigger than can be cached in RAM. How big are the tables? What is the value of innodb_buffer_pool_size? How much RAM do you have?
  • LEFT JOIN ( SELECT ... ) is terribly inefficient until at least 5.6. Please provide EXPLAIN SELECT ... to see if it is optimized. What version are you using?
  • Even worse is LEFT JOIN ( SELECT ... ) LEFT JOIN ( SELECT ... ). Added: Since I don't see "auto-key", this is bad. It makes me wonder if it is really MySQL 5.6.
  • Building and maintaining a "summary table" may be the ultimate answer. It would probably have a PRIMARY KEY including client_id, outlet_id, order_status, and sale_HOUR.
  • Does either subquery run slowly by itself? If so, start a separate question to focus on just that subquery. Please provide the output of SHOW CREATE TABLE; your description of the tables is missing a lot of details: indexes, datatypes, sizes, collations, etc. Added: Still need this; there are still some things to check. A possible solution: CREATE TEMPORARY TABLE for each of the two LEFT JOIN SELECTs, then use them.
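A minimal sketch of the temporary-table approach, under several assumptions: tmp_items and tmp_payments are hypothetical names, sales_id is assumed to be NOT NULL in both source tables (required for the PRIMARY KEY), and the aggregate lists are abbreviated, so the real version would carry over every SUM(...) from the original subqueries. The WHERE clauses are copied from the question as posted.

```sql
-- Materialise each derived table once, with a key the join can use.
CREATE TEMPORARY TABLE tmp_items (
    PRIMARY KEY (sales_id)
)
SELECT  sales_id,
        SUM(quantity) AS quantity,
        SUM(price_inc_vat_per_item*quantity) AS price_before_line_discount
        -- ... remaining SUM(...) columns from the original subquery
    FROM  sales_items
    WHERE  client_id = '0fe26d93-775f-440c-a119-13cbcb6cbc0c'
      AND  is_removed = 0
    GROUP BY  sales_id;

CREATE TEMPORARY TABLE tmp_payments (
    PRIMARY KEY (sales_id)
)
SELECT  sales_id,
        SUM(payment_amount-payment_change) AS payment_taken,
        SUM(CASE WHEN payment_method='CASH' THEN (payment_amount-payment_change) ELSE 0 END) AS payment_cash
    FROM  sales_payments
    WHERE  client_id = '0fe26d93-775f-440c-a119-1396c36cbc0c'
    GROUP BY  sales_id;

-- Join the two keyed temporary tables instead of two derived tables.
SELECT  DATE_FORMAT(CONVERT_TZ(sales.sale_date_time,'UTC','Europe/London'), '%l%p') AS title,
        COUNT(*) AS total_sales,
        SUM(tmp_items.quantity) AS total_quantities,
        SUM(tmp_payments.payment_taken) AS payment_taken,
        SUM(tmp_payments.payment_cash) AS payment_cash
    FROM  sales
    LEFT JOIN  tmp_items     ON sales.id = tmp_items.sales_id
    LEFT JOIN  tmp_payments  ON sales.id = tmp_payments.sales_id
    WHERE  sales.client_id = '0fe26d93-775f-440c-a119-1396c36cbc0c'
      AND  sales.outlet_id = 'd5b74bdf-5cef-4455-bf99-13cbcb6cbc0c'
      AND  sales.order_status = 'COMPLETED'
      AND  sales.sale_date_time >= '2016-01-28 00:00:00'
      AND  sales.sale_date_time <= '2016-11-28 23:59:00'
    GROUP BY  HOUR(CONVERT_TZ(sales.sale_date_time,'UTC','Europe/London'))
    ORDER BY  HOUR(CONVERT_TZ(sales.sale_date_time,'UTC','Europe/London'));

DROP TEMPORARY TABLE IF EXISTS tmp_items, tmp_payments;
```

The ORDER BY here repeats the GROUP BY expression, which avoids the extra sort pass but orders rows by hour of day rather than by sale_date_time; whether that is acceptable depends on what the report needs. Restricting the two temporary tables to the same date range as the outer query (for example by joining to sales inside them) would shrink them further, at the cost of a more involved build step.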
