CASE query optimization

SELECT
  COUNT(CASE WHEN VALUE = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN VALUE = 2 THEN 1 END) AS score_2,
  COUNT(CASE WHEN VALUE = 3 THEN 1 END) AS score_3,
  COUNT(CASE WHEN VALUE = 4 THEN 1 END) AS score_4,
  COUNT(CASE WHEN VALUE = 5 THEN 1 END) AS score_5,
  COUNT(CASE WHEN VALUE = 6 THEN 1 END) AS score_6,
  COUNT(CASE WHEN VALUE = 7 THEN 1 END) AS score_7,
  COUNT(CASE WHEN VALUE = 8 THEN 1 END) AS score_8,
  COUNT(CASE WHEN VALUE = 9 THEN 1 END) AS score_9,
  COUNT(CASE WHEN VALUE = 10 THEN 1 END) AS score_10
FROM
  `answers`
WHERE
`created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59' 

Is there a way to optimize this query? I have 4 million answer records in my DB, and it runs very slowly.

Try running this one time to create an index:

CREATE INDEX ix_ca on answers(created_at)

That should speed your query up. If you are curious about why, see here:

What is an index in SQL?

You could try adding a redundant composite index:

CREATE INDEX idx1 ON answers (created_at, `value`);

Because the index contains both columns the query needs, the result can be computed from the index content alone, without accessing the table data (a "covering" index).
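One way to check whether the composite index is actually used as a covering index is to run EXPLAIN on the query. The following is only a sketch: it assumes the index on (created_at, value) has already been created, and it shortens the select list to two of the ten columns for brevity.

```sql
-- Sketch: assumes a composite index on answers(created_at, value) exists.
-- Only two of the ten score columns are shown; the rest follow the same pattern.
EXPLAIN
SELECT
  COUNT(CASE WHEN `value` = 1 THEN 1 END) AS score_1,
  COUNT(CASE WHEN `value` = 2 THEN 1 END) AS score_2
FROM `answers`
WHERE `created_at` BETWEEN '2017-01-01 00:00:00' AND '2019-11-30 23:59:59';
-- If the query is answered from the index alone, the Extra column of the
-- EXPLAIN output should include "Using index".
```

If the select list or WHERE clause later references a column that is not in the index, the index stops covering the query and MySQL has to read the table rows again.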

Want it to be 10 times as fast? Use the Data Warehousing technique of building and maintaining a "Summary table". In this example the summary table might be

CREATE TABLE subtotals (
    dy DATE NOT NULL,
    `value` ... NOT NULL,   -- TINYINT UNSIGNED ?
    ct SMALLINT UNSIGNED NOT NULL, -- this is 2 bytes, max 65K; change if might be bigger
    PRIMARY KEY(value, dy)  -- or perhaps the opposite order
) ENGINE=InnoDB

Each night you summarize the day's data and build 10 new rows in subtotals (one per value).
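The nightly step could look something like the statement below. This is a sketch, not the answer author's exact job: it assumes the job runs shortly after midnight and summarizes the previous calendar day, and that the subtotals table has the columns dy, value and ct with a primary key on (value, dy) as described above.

```sql
-- Sketch of a nightly summarization job (assumed to run just after midnight).
-- Counts yesterday's answers per value and stores one row per (day, value).
INSERT INTO subtotals (dy, `value`, ct)
SELECT
  DATE(created_at) AS dy,
  `value`,
  COUNT(*) AS ct
FROM answers
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at <  CURDATE()
GROUP BY DATE(created_at), `value`
-- Re-running the job for the same day overwrites the count instead of failing
-- on the primary key. VALUES() here is deprecated in recent MySQL 8.0 releases
-- but still accepted.
ON DUPLICATE KEY UPDATE ct = VALUES(ct);
```

Running the same statement once without the date filter in the WHERE clause would backfill the historical rows. Note that a value with no answers on a given day produces no row, so some days may have fewer than 10 rows.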

Then the "report" query becomes

SELECT
  SUM(CASE WHEN VALUE = 1 THEN ct END) AS score_1,
  SUM(CASE WHEN VALUE = 2 THEN ct END) AS score_2,
  SUM(CASE WHEN VALUE = 3 THEN ct END) AS score_3,
  SUM(CASE WHEN VALUE = 4 THEN ct END) AS score_4,
  SUM(CASE WHEN VALUE = 5 THEN ct END) AS score_5,
  SUM(CASE WHEN VALUE = 6 THEN ct END) AS score_6,
  SUM(CASE WHEN VALUE = 7 THEN ct END) AS score_7,
  SUM(CASE WHEN VALUE = 8 THEN ct END) AS score_8,
  SUM(CASE WHEN VALUE = 9 THEN ct END) AS score_9,
  SUM(CASE WHEN VALUE = 10 THEN ct END) AS score_10
FROM
  `subtotals`
WHERE `dy` >= '2017-01-01'
  AND `dy`  < '2019-12-01'

Based on what you have provided, there will be about 10K rows in subtotals for this date range (roughly 1,064 days times 10 values); that's a lot less to wade through than 4M rows. It might run more than 10 times as fast.

More discussion: http://mysql.rjweb.org/doc.php/summarytables
