CASE query optimization

SELECT
  COUNT(CASE WHEN VALUE = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN VALUE = 2 THEN 1 END) AS score_2,
  COUNT(CASE WHEN VALUE = 3 THEN 1 END) AS score_3,
  COUNT(CASE WHEN VALUE = 4 THEN 1 END) AS score_4,
  COUNT(CASE WHEN VALUE = 5 THEN 1 END) AS score_5,
  COUNT(CASE WHEN VALUE = 6 THEN 1 END) AS score_6,
  COUNT(CASE WHEN VALUE = 7 THEN 1 END) AS score_7,
  COUNT(CASE WHEN VALUE = 8 THEN 1 END) AS score_8,
  COUNT(CASE WHEN VALUE = 9 THEN 1 END) AS score_9,
  COUNT(CASE WHEN VALUE = 10 THEN 1 END) AS score_10
FROM
  `answers`
WHERE
`created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59' 

Is there a way to optimize this query? I have 4 million answer records in my DB, and it runs very slowly.

Try running this one time to create an index:

CREATE INDEX ix_ca on answers(created_at)

That should speed your query up. If you are curious about why, see here:

What is an index in SQL?

You could try adding a redundant composite index:

CREATE INDEX idx1 ON answers(created_at, value)

With the redundant column in the index, the query should be resolved from the index content alone, without accessing the table data.
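One way to check whether the composite index really covers the query is to look at the execution plan. This is only a sketch: it assumes MySQL (the backticks in the question suggest so) and that the index on answers(created_at, value) suggested above has been created. Only two of the ten score columns are shown to keep it short.

```sql
-- Assumes a composite index on answers(created_at, value) already exists.
EXPLAIN
SELECT
  COUNT(CASE WHEN `value` = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN `value` = 2 THEN 1 END) AS score_2
  -- add score_3 .. score_10 in the same way as the original query
FROM `answers`
WHERE `created_at` >= '2017-01-01'
  AND `created_at` <  '2019-12-01';
```

If the Extra column of the plan reports "Using index", the rows are being read from the index alone. The date range spans almost three years, so the scan may still touch a large part of the index; any gain comes from the index entries being narrower than the full table rows.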

Want it to be 10 times as fast? Use the Data Warehousing technique of building and maintaining a "Summary table". In this example the summary table might be

CREATE TABLE subtotals (
    dy DATE NOT NULL,
    `value` ... NOT NULL,   -- TINYINT UNSIGNED ?
    ct SMALLINT UNSIGNED NOT NULL, -- this is 2 bytes, max 65K; change if might be bigger
    PRIMARY KEY(value, dy)  -- or perhaps the opposite order
) ENGINE=InnoDB

Each night you summarize the day's data and build 10 new rows in subtotals.
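The nightly step could look roughly like the statement below. It is a sketch under some assumptions that are not in the question: the job runs shortly after midnight and summarizes the previous calendar day, answers.value only holds scores 1 to 10, and no single day has more than 65,535 answers for one score (the SMALLINT UNSIGNED limit on ct).

```sql
-- Summarize yesterday's answers into one row per (score, day).
INSERT INTO subtotals (dy, `value`, ct)
SELECT DATE(created_at), `value`, COUNT(*)
FROM answers
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at <  CURDATE()
  AND `value` BETWEEN 1 AND 10           -- assumed score range
GROUP BY DATE(created_at), `value`
-- Makes a re-run overwrite the day's counts instead of failing on the primary key.
-- VALUES() in this clause is deprecated in recent MySQL 8.0 releases but still accepted.
ON DUPLICATE KEY UPDATE ct = VALUES(ct);
```

A score with no answers on a given day simply produces no row, so some days may add fewer than 10 rows. The report query still works in that case because SUM ignores the missing rows.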

Then the "report" query becomes

SELECT
  SUM(CASE WHEN VALUE = 1 THEN ct END) AS score_1,
  SUM(CASE WHEN VALUE = 2 THEN ct END) AS score_2,
  SUM(CASE WHEN VALUE = 3 THEN ct END) AS score_3,
  SUM(CASE WHEN VALUE = 4 THEN ct END) AS score_4,
  SUM(CASE WHEN VALUE = 5 THEN ct END) AS score_5,
  SUM(CASE WHEN VALUE = 6 THEN ct END) AS score_6,
  SUM(CASE WHEN VALUE = 7 THEN ct END) AS score_7,
  SUM(CASE WHEN VALUE = 8 THEN ct END) AS score_8,
  SUM(CASE WHEN VALUE = 9 THEN ct END) AS score_9,
  SUM(CASE WHEN VALUE = 10 THEN ct END) AS score_10
FROM
  `subtotals`
WHERE `dy` >= '2017-01-01'
  AND `dy`  < '2019-12-01'

Based on what you have provided, there will be about 10K rows in subtotals; that's a lot less to wade through than 4M rows. It might run more than 10 times as fast.
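The 10K estimate can be sanity-checked from the date range in the question, assuming one summary row per day per score value:

```sql
SELECT DATEDIFF('2019-12-01', '2017-01-01')      AS days_in_range,  -- 1064
       DATEDIFF('2019-12-01', '2017-01-01') * 10 AS expected_rows;  -- 10640
```

That is 365 + 365 + 334 days, times 10 scores, so the report scans on the order of ten thousand rows instead of four million.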

More discussion: http://mysql.rjweb.org/doc.php/summarytables
