
How to optimize a SQL query using multiple tables

I have this SQL query that grabs the 5 latest news posts. I want it to also grab the total likes and total comments for each post in the same query. However, the query I wrote gets slow with large amounts of data, so I am looking for a better solution. Here it is:

SELECT *, 
`id` as `newscode`, 
(SELECT COUNT(*) FROM `likes` WHERE `type`="newspost" AND `code`=`newscode`) as `total_likes`,
(SELECT COUNT(*) FROM `news_comments` WHERE `post_id`=`newscode`) as `total_comments`
FROM `news` ORDER BY `id` DESC LIMIT 5

Here is a SQLFiddle as well: http://sqlfiddle.com/#!2/d3ecbf/1

I would recommend adding total_likes and total_comments fields to the news table, which get incremented or decremented whenever a like or comment is added or removed.

Your likes and news_comments tables should be used for historical purposes only.

This strenuous counting should not be performed every time a page is loaded, because that is a complete waste of resources.
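
As a rough sketch of this approach in MySQL (the database used in the question's SQLFiddle), the counters could be added, backfilled once, and then kept current with triggers. The column types, the default of 0, and the trigger names are assumptions, not part of the original schema. Only the like counter is shown; the comment counter would follow the same pattern on news_comments.

```sql
-- Assumed column definitions; adjust the types to match your schema.
ALTER TABLE news
    ADD COLUMN total_likes INT NOT NULL DEFAULT 0,
    ADD COLUMN total_comments INT NOT NULL DEFAULT 0;

-- One-time backfill from the existing rows.
UPDATE news n
SET n.total_likes = (SELECT COUNT(*) FROM likes l
                     WHERE l.type = 'newspost' AND l.code = n.id),
    n.total_comments = (SELECT COUNT(*) FROM news_comments c
                        WHERE c.post_id = n.id);

-- Hypothetical trigger names. Each trigger adjusts the counter of the
-- news row that the inserted or deleted like points at.
CREATE TRIGGER trg_likes_after_insert
AFTER INSERT ON likes
FOR EACH ROW
    UPDATE news
    SET total_likes = total_likes + 1
    WHERE NEW.type = 'newspost' AND id = NEW.code;

CREATE TRIGGER trg_likes_after_delete
AFTER DELETE ON likes
FOR EACH ROW
    UPDATE news
    SET total_likes = total_likes - 1
    WHERE OLD.type = 'newspost' AND id = OLD.code;

-- The page query then needs no counting at all.
SELECT id, total_likes, total_comments
FROM news
ORDER BY id DESC
LIMIT 5;
```

Triggers are only one option here; application code that issues the same UPDATE whenever it inserts or deletes a like would work as well.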

First, check that there are supporting indexes on the id, post_id, and (type, code) columns.
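
In MySQL, one way to run this check is SHOW INDEX, which lists the existing indexes for a table. The table names below are the ones from the question.

```sql
-- List the indexes that currently exist on each table involved.
SHOW INDEX FROM news;
SHOW INDEX FROM likes;
SHOW INDEX FROM news_comments;
```

In the output, the Column_name column shows which columns each index covers, so you can see whether id, post_id, type, and code appear there.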

I assume this is T-SQL, as that is what I am most familiar with.

First, I would check the indexes. If those look good, then I'd check the statement itself. Take a look at your query plan to see how it's populating your result.

The order in which you write the conditions does not decide how they are evaluated; the optimizer picks the order. Either way, each subquery filters by type and code and finally gives you a count.

Right now, you're grabbing everything with certain codes, regardless of date. Since you said you want the latest, I assume there is a date column somewhere.

In order to speed things up, add another AND to your WHERE clause to account for the date: the last 24 hours, the last week, whatever fits.
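
A minimal sketch of such a date condition follows, written in MySQL syntax because the question's SQLFiddle uses MySQL (T-SQL would need different date functions). The created_at column is hypothetical, since the question does not show a date column. Note that this changes what is counted: the result is the number of likes in the last 7 days, not the all-time total.

```sql
-- created_at is a hypothetical column; the question does not show one.
SELECT n.*,
       (SELECT COUNT(*)
        FROM likes l
        WHERE l.type = 'newspost'
          AND l.code = n.id
          AND l.created_at >= NOW() - INTERVAL 7 DAY) AS recent_likes
FROM news n
ORDER BY n.id DESC
LIMIT 5;
```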

You could rewrite this using joins. MySQL has known issues with subqueries, especially when dealing with large data sets:

SELECT  n.*, 
        `id` as `newscode`, 
        COALESCE(l.TotalLikes, 0) AS `total_likes`,
        COALESCE(c.TotalComments, 0) AS `total_comments`
FROM    `news` n
        LEFT JOIN
        (   SELECT  Code, COUNT(*) AS TotalLikes
            FROM    `likes` 
            WHERE   `type` = "newspost" 
            GROUP BY Code
        ) AS l
            ON l.`code` = n.`id`
        LEFT JOIN
        (   SELECT  post_id, COUNT(*) AS TotalComments
            FROM    `news_comments` 
            GROUP BY post_id
        ) AS c
            ON c.`post_id` = n.`id`
ORDER BY n.`id` DESC LIMIT 5;

The reason is that when you use a join as above, MySQL will materialise the results of the subquery when it is first needed. For example, at the start of this query, MySQL will put the results of:

SELECT  post_id, COUNT(*) AS TotalComments
FROM    `news_comments` 
GROUP BY post_id

into an in-memory table and hash post_id for faster lookups. Then, for each row in news, it only has to look up TotalComments from this hashed table. When you use a correlated subquery, it will execute the query once for each row in news, which results in a large number of executions when news is large. If the initial result set is small, you may not see a performance benefit, and performance may even be worse.
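
To see which strategy MySQL picks for each form on your own data, you could compare their EXPLAIN output. This is a sketch using only the comments count. The select_type values named in the comments are what MySQL typically reports for these shapes; the exact plan depends on the server version and the data.

```sql
-- Correlated form: the subquery is typically reported with
-- select_type = DEPENDENT SUBQUERY.
EXPLAIN
SELECT n.id,
       (SELECT COUNT(*)
        FROM news_comments c
        WHERE c.post_id = n.id) AS total_comments
FROM news n
ORDER BY n.id DESC
LIMIT 5;

-- Join form: the grouped subquery is typically reported with
-- select_type = DERIVED.
EXPLAIN
SELECT n.id,
       COALESCE(c.TotalComments, 0) AS total_comments
FROM news n
LEFT JOIN (SELECT post_id, COUNT(*) AS TotalComments
           FROM news_comments
           GROUP BY post_id) AS c
       ON c.post_id = n.id
ORDER BY n.id DESC
LIMIT 5;
```

Timing both statements against realistic data volumes is the more reliable test, since only 5 news rows are returned here and the derived table still counts comments for every post.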

Examples on SQL Fiddle

Finally, you may want to index the relevant fields in news_comments and likes. For this particular query, I think the following indexes will help:

CREATE INDEX IX_Likes_Code_Type ON Likes (Code, Type);
CREATE INDEX IX_newcomments_post_id ON news_comments (post_id);

Although you may need to split the first index into two:

CREATE INDEX IX_Likes_Code ON Likes (Code);
CREATE INDEX IX_Likes_Type ON Likes (Type);
