
How can I speed up this Django ORM-generated query?

I have a Movie table, a Genre table, and a MovieGenre table that records which genres each movie has.

The Django ORM generated the query below when I try to get the movies that share at least all of the genres of a target movie.

SELECT "movies_movie"."id", "movies_movie"."imdb_id", ..etc..   "movies_movie"."last_ingested_on", 
COUNT("movies_movie"."id") AS "count", 
COUNT("movies_moviegenre"."genre_id") AS "genres_count" 
FROM "movies_movie" 
LEFT OUTER JOIN "movies_moviegenre" ON ( "movies_movie"."id" = "movies_moviegenre"."movie_id" ) 
INNER JOIN "movies_moviegenre" T4 ON ( "movies_movie"."id" = T4."movie_id" ) 
INNER JOIN "movies_moviegenre" T6 ON ( "movies_movie"."id" = T6."movie_id" ) 
WHERE ("movies_movie"."last_ingested_on" IS NOT NULL 
AND NOT ("movies_movie"."imdb_id" = 'tt0111161' ) 
AND "movies_movie"."type" = 'feature' 
AND "movies_movie"."certification" = 'R' 
AND T4."genre_id" = 1 AND T6."genre_id" = 10 ) 
GROUP BY "movies_movie"."id", "movies_movie"."imdb_id", "movies_movie"."movie", "movies_movie"."type", "movies_movie"."year", "movies_movie"."tagline", "movies_movie"."plot", "movies_movie"."runtime", "movies_movie"."rating", "movies_movie"."certification", "movies_movie"."budget", "movies_movie"."box_office_revenue", "movies_movie"."poster_url", "movies_movie"."trailer_url", "movies_movie"."mood_data", "movies_movie"."created_on", "movies_movie"."modified_on", "movies_movie"."last_ingested_on" 
HAVING COUNT("movies_moviegenre"."genre_id") >= 2 
ORDER BY "count" DESC

Can you see anything that could be making it slow? It takes 1107.26499557 ms, which isn't acceptable. Thanks in advance.

EXPLAIN output: http://explain.depesz.com/s/lEv

I'm not a SQL expert, but the two inner joins look strange to me: both use the same join condition, so they appear to be equivalent.

LEFT OUTER JOIN "movies_moviegenre" ON ( "movies_movie"."id" = "movies_moviegenre"."movie_id" ) 
INNER JOIN "movies_moviegenre" T4 ON ( "movies_movie"."id" = T4."movie_id" ) 
INNER JOIN "movies_moviegenre" T6 ON ( "movies_movie"."id" = T6."movie_id" )

That said, what about adding indexes on the columns in your WHERE clause? These in particular look like good index candidates:

AND "movies_movie"."type" = 'feature' AND "movies_movie"."certification" = 'R' AND T4."genre_id" = 1 AND T6."genre_id" = 10 )

See https://docs.djangoproject.com/en/1.7/topics/db/optimization/#use-standard-db-optimization-techniques
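As a rough illustration of the index suggestion, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the asker's database, the tables are cut down to the columns that appear in the WHERE clause, and the index names are made up. It checks that the planner picks up the new indexes for the filtered columns:

```python
import sqlite3

# In-memory SQLite stands in for the asker's real database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies_movie (
        id INTEGER PRIMARY KEY,
        imdb_id TEXT,
        type TEXT,
        certification TEXT
    );
    CREATE TABLE movies_moviegenre (
        id INTEGER PRIMARY KEY,
        movie_id INTEGER REFERENCES movies_movie(id),
        genre_id INTEGER
    );
    -- Hypothetical index names; the columns come from the WHERE clause.
    CREATE INDEX movie_type_cert_idx
        ON movies_movie (type, certification);
    CREATE INDEX moviegenre_genre_movie_idx
        ON movies_moviegenre (genre_id, movie_id);
""")


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[-1] for row in rows)


genre_plan = plan("SELECT movie_id FROM movies_moviegenre WHERE genre_id = 1")
movie_plan = plan(
    "SELECT id FROM movies_movie WHERE type = 'feature' AND certification = 'R'"
)
print(genre_plan)
print(movie_plan)
```

On the real database the equivalent check is to add the indexes and compare the EXPLAIN output before and after; whether the planner actually uses them depends on the data distribution.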

The problem was that I wasn't specific enough in my SELECT:

SELECT "movies_movie"."id", "movies_movie"."imdb_id", ..etc.. 

I was selecting every column in the movie table, and some columns contain a lot of data. After restricting the columns with Django's queryset .only() method, the time dropped from about 1000 ms to 200 ms. A good start.

Edit:

It was actually a single field holding a lot of data, so the real solution is to move that field into its own table.
