
How can I make this MySQL SELECT + GROUP BY query more efficient?

I have a fairly popular site that's now getting hammered with traffic, and my webhost has informed me that the following query takes up to 2 seconds to run. My MySQL skills aren't that great, so I'm sure I'm doing something wrong, but I'm not sure what could be done to improve it.

For simplicity's sake, assume live_blueprints is a table with four fields:

  • isSolved [tinyint(1)]
  • levelSlug [varchar(128)]
  • solution [varchar(255)]
  • trackCount [mediumint(7)]

I realize using a string (levelSlug) instead of an int (id) is probably a bad idea, so that's one of the things I'd like to fix. Basically, what I'm trying to do with this query is grab the top 49 blueprints with unique solution strings. The live_blueprints table has ~550k rows, and I think that's the main cause of the problem.

As I understand it, the query as written checks all 550k rows, groups them, and then chops off the top 49 to give to me. I'm wondering if there's a way to do this without it having to do so much work on the rows, perhaps even by creating a second table of just "unique" solutions.

Anyway, here's the query right now:

SELECT *
  FROM live_blueprints
 WHERE levelSlug = 'someLevelSlug'
   AND isSolved = 1
 GROUP BY solution
 ORDER BY trackCount ASC
 LIMIT 49

Thanks for whatever help or insight you can provide!

OK, so to answer some questions:

The only indexes on the table are on id and levelSlug. For starters, I'm going to add an index on solution.

I ran an EXPLAIN, so I think this is what you're looking for. levelID is the index on levelSlug.

id > 1
select_type > SIMPLE
table > live_blueprints
type > ref
possible_keys > levelID
key > levelID
key_len > 386
ref > const
rows > 4407
Extra > Using where; Using temporary; Using filesort

What kind of indexes do you have on your table?

I ask because an index on solution, solutionCoolness (in that order) should help a bit here.

Given the WHERE clause, you could even use an index on the columns levelSlug, isSolved, solution, solutionCoolness, in that order, to make it a little faster.

Either way, we need to know which indexes you have, and it would help to see the EXPLAIN output for the query.
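The composite index suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the index name idx_level_solved_solution is made up, and trackCount is used in place of solutionCoolness because that is the sort column listed in the question. The EXPLAIN variant selects only the grouped column and an aggregate, which is also an assumption about what the page actually needs.

```sql
-- Sketch only: the index name is hypothetical, and trackCount is an assumed
-- stand-in for solutionCoolness, which is not in the question's column list.
ALTER TABLE live_blueprints
  ADD INDEX idx_level_solved_solution (levelSlug, isSolved, solution, trackCount);

-- List the indexes that now exist on the table.
SHOW INDEX FROM live_blueprints;

-- Re-check the plan and compare the key and Extra columns with the
-- earlier EXPLAIN output.
EXPLAIN
SELECT solution, MIN(trackCount) AS minTrackCount
  FROM live_blueprints
 WHERE levelSlug = 'someLevelSlug'
   AND isSolved = 1
 GROUP BY solution
 ORDER BY minTrackCount ASC
 LIMIT 49;
```

Whether this index can be created as written depends on the storage engine and character set, since levelSlug and solution are long varchar columns; a prefix length on solution may be needed. Sorting on the aggregate can still require a filesort, so the plan is worth checking rather than assuming.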

How about adding another column? Let's call it ranking; it's only for your own use. When adding a new solution, set this column to zero if the solution already exists; otherwise insert 1, or something else such as the track count. That way you can get rid of the GROUP BY and just sort by trackCount where ranking is not 0.
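A minimal sketch of that ranking idea, under several assumptions: the flag is a 0/1 tinyint, "already exists" is judged per levelSlug among solved rows, id is auto-generated, and the literal values ('someSolution', 12) are placeholders.

```sql
-- Hypothetical flag column: 1 marks the first row seen for a solution, 0 a repeat.
ALTER TABLE live_blueprints
  ADD COLUMN ranking TINYINT(1) NOT NULL DEFAULT 0;

-- Insert a new blueprint; ranking becomes 1 only if no solved row with the
-- same levelSlug and solution exists yet (COUNT(*) always yields one row).
INSERT INTO live_blueprints (isSolved, levelSlug, solution, trackCount, ranking)
SELECT 1, 'someLevelSlug', 'someSolution', 12, IF(COUNT(*) = 0, 1, 0)
  FROM live_blueprints
 WHERE levelSlug = 'someLevelSlug'
   AND solution = 'someSolution'
   AND isSolved = 1;

-- The listing query then needs no GROUP BY.
SELECT *
  FROM live_blueprints
 WHERE levelSlug = 'someLevelSlug'
   AND isSolved = 1
   AND ranking <> 0
 ORDER BY trackCount ASC
 LIMIT 49;
```

Existing rows would all start with ranking = 0, so a one-time backfill (not shown) would be needed before the new query returns anything. Two concurrent inserts of the same solution could also both get ranking = 1 unless the application serializes them.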
