I have a very slow MySQL query that I would like to optimise.
The query is taking 66.2070 seconds to return 5 results from tables containing around 200 rows.
The database tables store users
, experiments
(A/B tests), goals
(page URLs), visits
(page visits) and conversions
(clicks a goal's URL). The visit
and conversion
tables both have a combination
column that records if version A or B of a page was visited or a conversion came from version A or B. Combinations are stored in the db as 1
or 2
.
I'm trying to get a list of a user's experiments with the number of visits and conversions for each combination.
For some relationships I'm using composite primary keys, which does make the joins more complicated. I doubt it but could this be the cause of the problem?
How can I rewrite this query to make it run in a reasonable time, at least less than a second?
Here's my database schema:
and her's my query:
SELECT e.id AS id,
e.name AS name,
e.status AS status,
e.created AS created,
Count(DISTINCT v1.id) AS visits1,
Count(DISTINCT v2.id) AS visits2,
Count(DISTINCT c1.id) AS conversions1,
Count(DISTINCT c2.id) AS conversions2
FROM experiment e
LEFT JOIN visit v1
ON ( v1.experiment_id = e.id
AND v1.user_id = e.user_id
AND v1.combination = 1 )
LEFT JOIN visit v2
ON ( v2.experiment_id = e.id
AND v2.user_id = e.user_id
AND v2.combination = 2 )
LEFT JOIN goal g
ON ( g.experiment_id = e.id
AND g.user_id = e.user_id
AND g.principal = 1 )
LEFT JOIN conversion c1
ON ( c1.experiment_id = e.id
AND c1.user_id = e.user_id
AND c1.goal_id = g.id
AND c1.combination = 1 )
LEFT JOIN conversion c2
ON ( c2.experiment_id = e.id
AND c2.user_id = e.user_id
AND c2.goal_id = g.id
AND c2.combination = 2 )
WHERE e.user_id = 25
GROUP BY e.id
ORDER BY e.created DESC
LIMIT 5
The resulting table should look something like this:
You should do the aggregations before doing the joins, to avoid getting large intermediate results. I think the logic is
SELECT e.id, e.name, e.status, e.created,
v.visits1, v.visits2, g.conversions1, g.conversions2
FROM experiment e LEFT JOIN
(SELECT experiment_id, user_id,
SUM(combination = 1) as visits1,
SUM(combination = 2) as visits2
FROM visits
WHERE combination IN (1, 2)
GROUP BY experiment_id, user_id
) v
ON v.experiment_id = e.id AND
v.user_id = e.user_id LEFT JOIN
(SELECT g.experiment_id, g.user_id,
SUM(c.combination = 1) as conversions1,
SUM(c.combination = 2) as conversions2
FROM goal g LEFT JOIN
conversion c
ON c.experiment_id = g.experiment_id AND
c.user_id = g.user_id AND
c.goal_id = g.id
WHERE g.principal = 1
GROUP BY g.experiment_id, g.user_id
) g
ON g.experiment_id = e.id AND
g.user_id = e.user_id LEFT JOIN
WHERE e.user_id = 25
ORDER BY e.created DESC
LIMIT 5 ;
There are further optimizations for this. For instance, an index on experiment(user_id, created, id)
.
For your question about the drawback of using composite keys I found this:
I can't currently test ur database but use the EXPLAIN
syntax in mysql to find out what is wrong with the perfomance of ur query:
MySQL docs about EXPLAIN and optimizing ur query with EXPLAIN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.