MySQL: Optimizing a query with three joins

Question

First things first: what I am doing works perfectly fine. I'm just checking whether there is any room for improvement, and whether the way I'm doing things is standard and follows good practice.

These are the tables in question:

item
topic
item_topic
item_like_audit

This is my use case:

A topic can contain many items.
Each item can have any number of likes.
For each like, a record is stored in the item_like_audit table so that it can be queried later for ranking purposes.

This is what the query is trying to achieve:

Get all items under a certain topic that received the most likes within the past 7 days.

Can the following query or underlying schema be improved in any way (for performance or memory gains)?

Query:

SELECT DISTINCT item.* FROM item

/* Match items under this specific topic */
JOIN topic
    ON topic.slug = ?
    AND topic.deleted_at IS NULL
JOIN item_topic
    ON item_topic.item_id = item.id
    AND item_topic.topic_id = topic.id
    AND item_topic.deleted_at IS NULL

/* Match items that have had "like" activity in the past 7 days */
JOIN item_like_audit
    ON item_like_audit.item_id = item.id
    AND item_like_audit.created_at >= (CURRENT_DATE - INTERVAL 7 DAY)
WHERE item.deleted_at IS NULL

/* Order by highest like count to lowest */
ORDER BY item.like_count DESC

/* Pagination */
LIMIT ? OFFSET ?

Schema:

CREATE TABLE item (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

    name VARCHAR(255) NOT NULL,
    slug VARCHAR(255) NOT NULL UNIQUE,
    tagline VARCHAR(255) NOT NULL,
    description VARCHAR(1000) NOT NULL,
    price FLOAT NOT NULL,
    like_count INT(10) NOT NULL DEFAULT 0,
    images VARCHAR(1000) NOT NULL,

    created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP NULL DEFAULT NULL,

    PRIMARY KEY (id)
);

CREATE TABLE item_like_audit (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

    item_id INT(10) UNSIGNED NOT NULL,
    user_id INT(10) UNSIGNED NOT NULL,

    created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,

    PRIMARY KEY (id),
    KEY `item_like_audit_created_at_index` (`created_at`)
);

CREATE TABLE topic (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

    name VARCHAR(255) NOT NULL,
    slug VARCHAR(255) NOT NULL UNIQUE,

    created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP NULL DEFAULT NULL,

    PRIMARY KEY (id)
);

CREATE TABLE item_topic (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

    item_id INT(10) NOT NULL,
    topic_id INT(10) NOT NULL,

    created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP NULL DEFAULT NULL,

    PRIMARY KEY (id)
);

Answer 1

Since you are only returning item records, you could try this for possibly improved performance:

select item.*
  from item
 where item.deleted_at is null
   and exists (select 1 from item_topic
                where item_topic.item_id = item.id
                  and item_topic.deleted_at is null
                  and exists (select 1 from topic
                               where topic.id = item_topic.topic_id
                                 and topic.deleted_at is null
                                 and topic.slug = ?))
   and exists (select 1 from item_like_audit
                where item_like_audit.item_id = item.id
                  and item_like_audit.created_at >= (current_date - interval 7 day))
 order by item.like_count desc

This can potentially improve performance because:

You don't need the DISTINCT operator.
The database only has to find one row in each supporting table that matches the constraints, instead of all matching records.
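
The difference between the two forms can be seen on a tiny data set. The following is only a sketch: demo_item and demo_like are hypothetical scratch tables made up for this illustration, not part of the schema in the question.

```sql
-- Hypothetical scratch tables, used only for this illustration.
CREATE TEMPORARY TABLE demo_item (
    id INT UNSIGNED NOT NULL PRIMARY KEY
);
CREATE TEMPORARY TABLE demo_like (
    item_id INT UNSIGNED NOT NULL
);

INSERT INTO demo_item (id) VALUES (1), (2);
-- Item 1 has three likes, item 2 has one.
INSERT INTO demo_like (item_id) VALUES (1), (1), (1), (2);

-- JOIN: one output row per matching like.
-- Returns 4 rows (item 1 three times, item 2 once), hence the need for DISTINCT.
SELECT demo_item.id
  FROM demo_item
  JOIN demo_like
    ON demo_like.item_id = demo_item.id;

-- EXISTS: each item is returned at most once, however many likes it has.
-- Returns 2 rows.
SELECT demo_item.id
  FROM demo_item
 WHERE EXISTS (SELECT 1
                 FROM demo_like
                WHERE demo_like.item_id = demo_item.id);
```

Whether the EXISTS form is actually faster on real data depends on the optimizer and the available indexes, so it is worth measuring both.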

Answer 2

Assuming item_topic(item_id, topic_id) is unique, we could do away with the "Using filesort" operation by getting rid of the DISTINCT keyword and rewriting the check of item_like_audit as an EXISTS correlated subquery instead of a JOIN operation.

We'd have a guarantee of uniqueness if we had

  CREATE UNIQUE INDEX item_topic_UX1 ON item_topic (topic_id, item_id);

We already have guarantees of uniqueness for topic(slug), topic(id), item(id), ...

  SELECT item.* 
    FROM item

/* Match items under this specific topic */
    JOIN item_topic
      ON item_topic.item_id = item.id
     AND item_topic.deleted_at IS NULL
    JOIN topic
      ON topic.id    = item_topic.topic_id
     AND topic.slug  = ?
     AND topic.deleted_at IS NULL

   WHERE item.deleted_at IS NULL
/* Match items that have had "like" activity in the past 7 days */
     AND EXISTS ( SELECT 1
                    FROM item_like_audit
                   WHERE item_like_audit.item_id = item.id
                     AND item_like_audit.created_at >= DATE(NOW()) + INTERVAL -7 DAY
                 )

/* Order by highest like count to lowest */
  ORDER BY item.like_count DESC

For improved performance of the correlated subquery, we could create a covering index

  CREATE INDEX item_like_audit_IX1 ON item_like_audit (item_id, created_at)

We expect the unique index we created earlier to be used for the join operation, so that should also improve performance. We could get a covering index if we included the deleted_at column

  CREATE INDEX item_topic_IX2 ON item_topic (topic_id, item_id, deleted_at)

That makes the unique index we created earlier redundant. If we still want to guarantee uniqueness, flip the order of the columns in the unique index:

  DROP INDEX item_topic_UX1 ON item_topic ;
  CREATE UNIQUE INDEX item_topic_UX1 ON item_topic (item_id,topic_id);
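
Creating a unique index fails if the table already contains duplicate pairs, so it may be worth checking first. A minimal sketch, assuming the item_topic table from the question; the alias row_count is made up for this example:

```sql
-- List (item_id, topic_id) pairs that occur more than once.
-- An empty result means the unique index can be created as-is.
SELECT item_id,
       topic_id,
       COUNT(*) AS row_count
  FROM item_topic
 GROUP BY item_id, topic_id
HAVING COUNT(*) > 1;
```

One thing to keep in mind: the unique index applies to every row, including rows whose deleted_at is set. If the application re-adds an item to a topic by inserting a new row rather than clearing deleted_at on the old one, the insert would be rejected as a duplicate.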

If we don't have guaranteed uniqueness, then I would favor adding a GROUP BY item.id clause over using the DISTINCT keyword.
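
As a sketch of what that could look like, here is the query from the question with DISTINCT replaced by GROUP BY item.id. The 7-day window is written the way the answers write it (on or after seven days ago). Selecting item.* while grouping only by item.id relies on item.id being the primary key; with ONLY_FULL_GROUP_BY enabled this assumes MySQL 5.7.5 or later, which detects that functional dependency.

```sql
SELECT item.*
  FROM item

  /* Match items under this specific topic */
  JOIN topic
    ON topic.slug = ?
   AND topic.deleted_at IS NULL
  JOIN item_topic
    ON item_topic.item_id = item.id
   AND item_topic.topic_id = topic.id
   AND item_topic.deleted_at IS NULL

  /* Match items that have had "like" activity in the past 7 days */
  JOIN item_like_audit
    ON item_like_audit.item_id = item.id
   AND item_like_audit.created_at >= (CURRENT_DATE - INTERVAL 7 DAY)

 WHERE item.deleted_at IS NULL

 /* Collapse the duplicate rows produced by the joins */
 GROUP BY item.id

 /* Order by highest like count to lowest */
 ORDER BY item.like_count DESC

 /* Pagination */
 LIMIT ? OFFSET ?
```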

Use EXPLAIN to see the execution plan, and verify that appropriate indexes are being used.
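
For example, prefix the statement with EXPLAIN and substitute a literal for the placeholder. The slug 'example-topic' below is a made-up value for illustration only.

```sql
EXPLAIN
SELECT item.*
  FROM item
  JOIN item_topic
    ON item_topic.item_id = item.id
   AND item_topic.deleted_at IS NULL
  JOIN topic
    ON topic.id = item_topic.topic_id
   AND topic.slug = 'example-topic'   -- hypothetical slug value
   AND topic.deleted_at IS NULL
 WHERE item.deleted_at IS NULL
   AND EXISTS ( SELECT 1
                  FROM item_like_audit
                 WHERE item_like_audit.item_id = item.id
                   AND item_like_audit.created_at >= DATE(NOW()) + INTERVAL -7 DAY
              )
 ORDER BY item.like_count DESC;
```

In the output, the key column shows which index (if any) was chosen for each table, and the Extra column shows notes such as "Using index" (the index covers the query for that table), "Using temporary", or "Using filesort".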

If we can't guarantee uniqueness of (item_id, topic_id) in item_topic, and the overhead of the "Using filesort" operation for the GROUP BY is still too high, we could try checking the "matching topic" condition using an EXISTS. (But I don't hold out much hope that this will be any faster.)

  SELECT item.*
    FROM item
   WHERE item.deleted_at IS NULL
     AND EXISTS ( SELECT 1
                    FROM topic
                    JOIN item_topic
                      ON item_topic.item_id    = item.id
                     AND item_topic.topic_id   = topic.id
                     AND item_topic.deleted_at IS NULL
                    JOIN item_like_audit 
                      ON item_like_audit.item_id = item.id
                     AND item_like_audit.created_at >= DATE(NOW()) + INTERVAL -7 DAY 
                   WHERE topic.slug  = ?
                     AND topic.deleted_at IS NULL
                )
  ORDER BY item.like_count DESC

We will need suitable indexes available for the correlated subquery to perform well.

2 answers

Answer 1
1 2017-11-08 16:53:10

Answer 2
1 Accepted 2017-11-08 16:56:23