How to optimize an SQL query with COUNT and GROUP BY

Question

I have a table cast with about 1.5 million rows, and a smaller table watched with about 1000-2000 rows. Both tables share a column named movieId. I am trying to run this query:

SELECT actorId, COUNT(actorId)
FROM cast t1
WHERE EXISTS (
    SELECT userId
    FROM watched t2
    WHERE t1.movieId = t2.movieId
    AND t2.userId = 8
)
GROUP BY actorId

However, it takes about 5 seconds to return the results. I have a multi-column index on actorId and movieId in the cast table, and indexes on userId and movieId in the watched table. The query returns around 20,000 rows. Is there any way I could optimize my query or tables so that the query runs faster?

Answer 1

For this query:

SELECT c.actorId, COUNT(*)
FROM cast c
WHERE EXISTS (SELECT 1
              FROM watched w
              WHERE w.movieId = c.movieId AND w.userId = 8
             )
GROUP BY c.actorId;

You want an index on watched(movieId, userId). An index on cast(movieId, actorId) might also prove useful.
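
As a rough sketch, the two indexes suggested above could be created as follows. The index names are made up for illustration, and the syntax assumes a MySQL-style engine; in some databases cast is a keyword, and the table name would need quoting.

```sql
-- Hypothetical index names; adjust to your own naming scheme.
-- Quote the table name if your engine treats cast as a keyword.

-- Lets the EXISTS subquery be answered from the index alone:
-- it seeks on movieId, then checks userId.
CREATE INDEX idx_watched_movie_user ON watched (movieId, userId);

-- Contains both columns the outer query reads from cast,
-- so the engine may be able to avoid touching the table rows.
CREATE INDEX idx_cast_movie_actor ON cast (movieId, actorId);
```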

Notice that I changed the table aliases to be more meaningful than arbitrary letters.

EDIT:

Given the size of the tables, I think an explicit join might be better:

SELECT c.actorId, COUNT(*)
FROM watched w JOIN
     cast c
     ON w.movieId = c.movieId
WHERE w.userId = 8
GROUP BY c.actorId;

For this query, you want indexes on watched(userId, movieId) and cast(movieId, actorId). This version assumes you don't have duplicate rows in watched; if the same movie appears twice for the user, the join counts its actors twice.
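
One way to check the no-duplicates assumption before relying on the join is sketched below. It assumes a duplicate means the same (userId, movieId) pair stored more than once; the index name is hypothetical.

```sql
-- List any (userId, movieId) pairs stored more than once in watched.
-- An empty result means the join will not inflate the counts.
SELECT userId, movieId, COUNT(*) AS copies
FROM watched
GROUP BY userId, movieId
HAVING COUNT(*) > 1;

-- Optional: a unique index supports the WHERE userId = 8 lookup and
-- also prevents future duplicates. It fails to build if duplicates
-- already exist, so run the check above first.
CREATE UNIQUE INDEX idx_watched_user_movie ON watched (userId, movieId);
```

If duplicates are legitimate in your data, the EXISTS form above is the safer choice, since it counts each cast row at most once.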

Answer 2

Perhaps using an inner join instead of EXISTS will give you better performance. Assuming movieId and userId are indexed, try inner joining to watched, using the filters from your nested WHERE clause:

SELECT c.actorId, COUNT(*)
FROM cast c
INNER JOIN watched w
    ON w.movieId = c.movieId
    AND w.userId = 8
GROUP BY c.actorId;

The above, in theory, should be a less expensive operation, since each record in cast isn't tested individually in an EXISTS clause.
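
Whether the join actually beats EXISTS depends on the optimizer, so it may be worth comparing the plans rather than assuming. A minimal sketch, assuming an engine that accepts a plain EXPLAIN prefix (MySQL does; SQLite uses EXPLAIN QUERY PLAN instead):

```sql
-- Plan for the original EXISTS form.
EXPLAIN
SELECT c.actorId, COUNT(*)
FROM cast c
WHERE EXISTS (SELECT 1
              FROM watched w
              WHERE w.movieId = c.movieId AND w.userId = 8)
GROUP BY c.actorId;

-- Plan for the join form; compare which table is read first
-- and which indexes are chosen.
EXPLAIN
SELECT c.actorId, COUNT(*)
FROM cast c
INNER JOIN watched w
    ON w.movieId = c.movieId
    AND w.userId = 8
GROUP BY c.actorId;
```

If both plans start from watched and then look up cast by movieId, the two forms are likely to perform about the same.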

Please excuse the lack of styling, I'm posting from an iPad.

2 answers

solution1
3 ACCEPTED 2017-10-08 02:14:50

solution2
2 2017-10-08 02:22:08
