MySQL query runs very slow
I have a MySQL (v5.7.26) query that runs forever. Here is the query:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (
SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY
)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
I have experimented with it a bit to understand where the problem is. For testing purposes, I tried breaking it in two. First:
SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY;
That returns (very quickly) a list of user_ids. When I plug the returned result into the first part of the query, like this:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (1,1,1,4,4,5,6,7,7,7)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
It runs very fast. My assumption is that the IN (subquery) is the bottleneck.
I was thinking of extracting the subquery to get the user_ids and then using the result as a variable, but I am not sure whether that is a good approach, and I am also having issues with it. This is my attempt:
-- first statement
SET @v1 = (SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY);
-- second statement
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS prefixes
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (@v1)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
The problem here is that the first statement returns an error:
Subquery returns more than 1 row.
The expected result is a list of user_ids, which can contain duplicates. And I need those duplicates for the count.
How can I fix this?
Try EXISTS instead of IN:
...
AND EXISTS (SELECT *
FROM user_resource ur2
WHERE ur2.user_id = ur.user_id
AND ur2.action_date >= now() - INTERVAL 2 DAY)
...
and indexes on user_resource (user_id, action_date), user_resource (status, action_date, user_id) and/or resource (type).
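Put together, the suggestion above might look like the sketch below. The index names (`idx_ur_user_action`, `idx_ur_status_action_user`, `idx_r_type`) are made up for illustration, and the last index is assumed to belong on `resource`, because the question filters on `r.type`. It is worth checking the plan with `EXPLAIN` before keeping any of them.

```sql
-- Hypothetical index names; use whatever fits your naming scheme.
CREATE INDEX idx_ur_user_action ON user_resource (user_id, action_date);
CREATE INDEX idx_ur_status_action_user ON user_resource (status, action_date, user_id);
-- Assumption: `type` is a column of `resource` (the query filters on r.type).
CREATE INDEX idx_r_type ON resource (type);

-- The original query with IN (subquery) replaced by a correlated EXISTS.
SELECT
    ur.user_id        AS user_id,
    SUM(r.duration)   AS total_time,
    COUNT(ur.user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE ur.status = 1
  AND ur.action_date IS NOT NULL
  AND r.type = 'WORKOUT'
  AND EXISTS (SELECT 1
              FROM user_resource ur2
              WHERE ur2.user_id = ur.user_id
                AND ur2.action_date >= NOW() - INTERVAL 2 DAY)
GROUP BY ur.user_id;
```

EXISTS only tests whether at least one matching row is present, so each `user_resource` row is counted once, just as with `IN`.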
You could try:
-- first statement
SET @v1 = (SELECT GROUP_CONCAT(user_id)
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY);
-- second statement
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS prefixes
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE ur.status = 1 AND NOT ur.action_date IS NULL AND FIND_IN_SET(ur.user_id,@v1)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id
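One caveat with this approach: the result of `GROUP_CONCAT()` is cut off at `group_concat_max_len`, which defaults to 1024 bytes, so a long list of ids would be truncated. A minimal sketch of raising the limit for the current session follows; the value 1000000 is arbitrary, and `DISTINCT` is added here only to keep the string short.

```sql
-- Raise the per-session limit; 1000000 is an arbitrary illustrative value.
SET SESSION group_concat_max_len = 1000000;

-- DISTINCT keeps the list short. FIND_IN_SET only tests membership,
-- so repeated ids in the list would not change the result.
SET @v1 = (SELECT GROUP_CONCAT(DISTINCT user_id)
           FROM user_resource ur2
           WHERE ur2.action_date >= NOW() - INTERVAL 2 DAY);

-- Quick sanity check on the size of the generated list.
SELECT LENGTH(@v1) AS list_length;
```

Because `FIND_IN_SET(ur.user_id, @v1)` wraps the column in a function call, MySQL is unlikely to use an index on `user_id` for this condition, so it may still be slow on a large table.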
An additional join will be faster than the subquery:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(ur.user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
INNER JOIN (
SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY
) t ON t.user_id = ur.user_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
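Note that the derived table `t` can return the same `user_id` several times (once per recent `user_resource` row), and an inner join repeats every matching `ur` row once per duplicate, which inflates `SUM` and `COUNT` compared with the `IN` version. If the totals should match the original query, adding `DISTINCT` to the derived table is one option. This is a sketch that has not been run against the asker's schema:

```sql
SELECT
    ur.user_id        AS user_id,
    SUM(r.duration)   AS total_time,
    COUNT(ur.user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
INNER JOIN (
    -- DISTINCT: one row per recently active user,
    -- so the join does not multiply the rows of ur.
    SELECT DISTINCT ur2.user_id
    FROM user_resource ur2
    WHERE ur2.action_date >= NOW() - INTERVAL 2 DAY
) t ON t.user_id = ur.user_id
WHERE ur.status = 1
  AND ur.action_date IS NOT NULL
  AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
```

The column in `COUNT` is qualified as `ur.user_id` because both `ur` and `t` expose a `user_id` column, and an unqualified reference would be ambiguous.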