
Django: how to make this query more efficient

I have two models:

class Answer(models.Model):
    ...

class Photo(models.Model):
    name = models.CharField(...)
    answer_id = models.ForeignKey(Answer, related_name='photos')

The relationship is one-to-many (one answer has many photos).

I need to write a reporting function that outputs all answers along with corresponding photos. Here is what I have:

def report():
    answers = Answer.objects.all()

    for answer in answers:
        result = extract_data(answer)
        for photo in answer.photos.all():  # <- runs a query N times, where N = number of answers in db
            result += photo.name + '\r'

        append_result_to_report(result)

This works, but as you can see, the query behind for photo in answer.photos.all() is executed once for every answer in the db.

Preferably, I would like to execute only two db queries: one that fetches all of the answers and another that fetches all of the photos. So I tried this:

def report():
    answers = Answer.objects.all()
    photos = list(Photo.objects.all()) # <- store the result in memory
    for answer in answers:
        result = extract_data(answer)
        for photo in photos: 
            if photo.answer_id == answer.id:
                result += photo.name + '\r'

        append_result_to_report(result)

This approach has decreased the number of db queries to two, but it takes even longer to execute as a whole: 9.5 seconds for this approach vs 7.5 seconds for the first approach.
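A likely reason for the slowdown is the nested loop: every photo is compared against every answer, so the work grows with (number of answers) x (number of photos). The sketch below shows one way to avoid that by grouping the photos by answer id in a single pass and then doing a dictionary lookup per answer. The Answer and Photo namedtuples and the group_photo_names helper are hypothetical stand-ins for illustration, not the Django models from the question.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for rows loaded from the Answer and Photo tables
Answer = namedtuple('Answer', ['id'])
Photo = namedtuple('Photo', ['name', 'answer_id'])


def group_photo_names(photos):
    """Map each answer id to the names of its photos in one pass."""
    names_by_answer = defaultdict(list)
    for photo in photos:
        names_by_answer[photo.answer_id].append(photo.name)
    return names_by_answer


answers = [Answer(1), Answer(2), Answer(3)]
photos = [Photo('a.jpg', 1), Photo('b.jpg', 2), Photo('c.jpg', 1)]

names_by_answer = group_photo_names(photos)
for answer in answers:
    # One dict lookup per answer instead of rescanning the whole photo list
    result = ''.join(name + '\r' for name in names_by_answer.get(answer.id, []))
```

With this layout each photo is visited once, regardless of how many answers there are.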

Any suggestions on how to be more efficient?

Thank you!

PS I am using Django 1.8.2


UPDATE: I used the method suggested by @Mark Galloway, and the execution time has dropped to 1.6 seconds. The number of queries is now 3. Django performed the following queries:

  • select * from answer
  • select * from photo
  • select * from photo where 'photo.answer_id' in (19,20,3...) # the numbers inside the () do not seem to be contiguous

What is the purpose of the last query?

By using prefetch_related, you can walk the one-to-many relationship and fetch all of the photos in two queries: one for all of the answers, and another for all of the photos related to those answers.

answers = Answer.objects.all().prefetch_related('photos')

for answer in answers:
    result = extract_data(answer)
    for photo in answer.photos.all(): 
        result += photo.name + '\r'

    append_result_to_report(result)
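To make the two-query pattern concrete, here is a rough, simplified sketch of what prefetch_related does conceptually: fetch the parent rows, fetch all related rows with a single IN (...) lookup on the foreign key, and do the joining in Python. It uses the standard-library sqlite3 module rather than Django, and the table layout and sample rows are assumptions made up for the illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical in-memory tables standing in for the answer and photo tables
conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE answer (id INTEGER PRIMARY KEY);
    CREATE TABLE photo (id INTEGER PRIMARY KEY, name TEXT, answer_id INTEGER);
    INSERT INTO answer (id) VALUES (19), (20), (3);
    INSERT INTO photo (name, answer_id)
        VALUES ('a.jpg', 19), ('b.jpg', 3), ('c.jpg', 19);
""")

# Query 1: all answers
answer_ids = [row[0] for row in conn.execute("SELECT id FROM answer")]

# Query 2: every photo that belongs to one of those answers, in one lookup
placeholders = ', '.join('?' for _ in answer_ids)
rows = conn.execute(
    "SELECT name, answer_id FROM photo "
    "WHERE answer_id IN (%s) ORDER BY id" % placeholders,
    answer_ids,
).fetchall()

# The "join" happens in Python: group photo names by their answer id
photos_by_answer = defaultdict(list)
for name, answer_id in rows:
    photos_by_answer[answer_id].append(name)
```

The IN (...) list is simply the primary keys of the answers returned by the first query, which is why the ids need not be contiguous. An unfiltered select * from photo is not part of this pattern; if one shows up alongside it, a leftover Photo.objects.all() call elsewhere in the code is one possible cause, though that is only a guess.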

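select_related only follows forward foreign-key and one-to-one relations, so it cannot traverse the reverse one-to-many photos relation here. Use the prefetch_related query modifier instead:

def report():
    # fetches the answers in one query and their photos in a second one
    answers = Answer.objects.all().prefetch_related('photos')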

    for answer in answers:
        result = extract_data(answer)
        for photo in answer.photos.all():
            result += photo.name + '\r'

        append_result_to_report(result)


 