MySQL GROUP BY taking so much time to fetch records

I want to query the database to fetch the last visit time of every user. Here is the query:

SELECT 
u.user_id,
u.firstname,
u.lastname,
u.email,
pv.visit_time 
FROM
  users u 
LEFT OUTER JOIN pageviews pv 
    ON u.user_id = pv.user_id 
   GROUP BY pv.user_id 
LIMIT 0, 12 

This query takes 30 to 40 seconds to execute on the live server. However, if I remove the GROUP BY clause it takes 3 to 6 seconds, but returns duplicate records. Any idea what's wrong with this query?

I have also tried DISTINCT but ran into the same issue. Thanks, any help would be appreciated.

GROUP BY and DISTINCT typically require a full scan of the table, because every matching row has to be read before the groups are known.

Maybe the query without the GROUP BY clause is simply faster at returning the first rows. Have you checked how long it takes to retrieve the whole result set?

If it really takes only 3 to 6 seconds, I would refresh the statistics; maybe the optimiser is not making the best choices for the join (I imagine that the pageviews table is a large one).
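As a rough sketch of what refreshing the statistics could look like in MySQL, using the table names from the question (whether it changes the plan depends on the storage engine and on how stale the statistics actually are):

```sql
-- Recompute the key distribution statistics the optimiser uses
-- when deciding join order and index usage.
ANALYZE TABLE users, pageviews;
```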

What are your indexes?
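One way to answer that, sketched here with the table names from the question, is to list the existing indexes and ask MySQL for the execution plan of the slow query:

```sql
-- List the indexes currently defined on each table.
SHOW INDEX FROM users;
SHOW INDEX FROM pageviews;

-- Show how MySQL plans to execute the original query. In the output,
-- type = ALL means a full table scan, and "Using temporary; Using filesort"
-- in the Extra column means the grouping is done without index support.
EXPLAIN
SELECT u.user_id, u.firstname, u.lastname, u.email, pv.visit_time
FROM users u
LEFT OUTER JOIN pageviews pv
    ON u.user_id = pv.user_id
GROUP BY pv.user_id
LIMIT 0, 12;
```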

Do you really want a left join? It seems irrelevant here. With a LEFT OUTER JOIN and GROUP BY pv.user_id, all users without any page views would be folded into a single row where pv.user_id is NULL, with a NULL visit_time as well.

Further, you are using GROUP BY to return a single row for each user. However, which row is returned is not defined, so the visit_time brought back for a user could come from any of that user's page views.

Also, you have only a single column in the GROUP BY clause but other non-aggregate columns in the SELECT list. With the historical default options in MySQL this works, but it does not work in most flavours of SQL, and it also fails when MySQL has the ONLY_FULL_GROUP_BY SQL mode enabled, which is the default since MySQL 5.7.5 (see this manual page).
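To see which behaviour applies on a given server, the session's SQL mode can be inspected. A small sketch (the SET statement is only meant for a throwaway test session, since it replaces every other mode for that session):

```sql
-- Show the SQL modes active for the current session.
SELECT @@SESSION.sql_mode;

-- For testing only: make this session enforce full GROUP BY checking.
-- This overwrites all other modes for the session.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With that mode active, the query from the question is expected to be
-- rejected with error 1055, because pv.visit_time is neither aggregated
-- nor listed in the GROUP BY clause.
```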

Add an index on u.user_id and a compound index on pv.user_id and pv.visit_time. Then, assuming you want the latest visit time for each user, try the query as:

SELECT u.user_id,
    u.firstname,
    u.lastname,
    u.email,
    MAX(pv.visit_time)
FROM users u 
INNER JOIN pageviews pv 
ON u.user_id = pv.user_id 
GROUP BY u.user_id, u.firstname, u.lastname, u.email
ORDER BY u.user_id
LIMIT 0, 12

(Strictly speaking, the ORDER BY clause is not required in MySQL versions before 8.0, where GROUP BY sorts implicitly. MySQL 8.0 no longer does that, and in any case the ORDER BY makes the expected order explicit to anyone reading the code in future.)
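The two indexes suggested in this answer could be created along these lines. The index names are made up for illustration, and the first statement is unnecessary if user_id is already the primary key of users:

```sql
-- Index on the join column of users (skip if user_id is the primary key).
CREATE INDEX idx_users_user_id ON users (user_id);

-- Compound index: user_id first so the join and the grouping can use it,
-- visit_time second so MAX(visit_time) per user can be read from the index.
CREATE INDEX idx_pageviews_user_visit ON pageviews (user_id, visit_time);
```

Running EXPLAIN on the rewritten query afterwards should show whether the optimiser actually picks up the compound index.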

SELECT t1.x, t1.y, t1.z FROM table1 t1 GROUP BY t1.x, t1.y, t1.z ....

It will give better performance.

The GROUP BY fields (x, y, z) should match the columns in the SELECT statement to get better performance.

Try it (the grouping is then performed within the result set of the query above).
