
Select_related (and prefetch_related) not working as intended?

My models:

class Anything(models.Model):
    first_owner = models.ForeignKey(Owner, related_name='first_owner')
    second_owner = models.ForeignKey(Owner, related_name='second_owner')

class Something(Anything):
    one_related = models.ForeignKey(One, related_name='one_related', null=True)
    many_related = models.ManyToManyField(One, related_name='many_related')

class One(models.Model):
    first = models.IntegerField(null=True)
    second = models.IntegerField(null=True)

In my code I want to build a small summary of my database, like this:

all_owners = Owner.objects.all()
first_selection = []
second_selection = []

objects = {}

for owner in all_owners:
    # <<0>>
    items = Something.objects.filter(Q(first_owner=owner)|Q(second_owner=owner)).order_by('date').all()

    # Find owners that have more than 100 related "Something" objects
    if items.count() > 100:
        first_selection.append(owner)
        objects[owner] = items

    # Find owners that have more than 80 related "Something" objects with at least one many_related element
    if items.filter(many_related__isnull=False).distinct().count() > 80:
        second_selection.append(owner)
        objects[owner] = items

# Now I pass first_selection, second_selection and objects to functions, but the following loops reproduce the problem I'm getting:

# <<1>>
for owner in first_selection:
    for something in objects[owner]:
        rel = something.one_related
        print(str(rel.first) + "blablabla" + str(rel.second))
# <<2>>
for owner in first_selection:
    for something in objects[owner]:
        rel = something.one_related
        print(str(rel.first) + "blablabla" + str(rel.second))

# <<3>>
for owner in second_selection:
    for something in objects[owner]:
        rel = something.many_related.first()
        if rel is not None:
            print(str(rel.first) + "blablabla" + str(rel.second))
# <<4>>
for owner in second_selection:
    for something in objects[owner]:
        rel = something.many_related.first()
        if rel is not None:
            print(str(rel.first) + "blablabla" + str(rel.second))

The problem is: the <<1>> loop takes 30 minutes to execute, while the <<2>> loop takes 2 seconds, although they use the same data.
I know why it happens: the 1st loop fetches every one_related object and stores it in the cache. So I changed the code at <<0>> to:

        items = Something.objects.filter(Q(first_owner=owner)|Q(second_owner=owner)).order_by('date').select_related('one_related').all()

When I look at the generated query, it looks like it performs a join on the tables.
But the problem still occurs (the first loop takes minutes and the second takes seconds). In fact, I used mysqltuner to watch the number of queries performed: it grows throughout the first loop, although it shouldn't.
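
As a rough illustration of why the query count matters here, below is a minimal sketch outside Django, using Python's standard sqlite3 module. The table layout, the run() helper and the sample rows are assumptions made up for this example; they only mimic the one_related relation above. It compares fetching each related row lazily (one extra query per row, which is what the first loop appears to be doing) with a single JOIN, which is what select_related('one_related') is expected to produce.

```python
import sqlite3

# Hypothetical schema that mimics Something.one_related -> One.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (id INTEGER PRIMARY KEY, first INTEGER, second INTEGER);
    CREATE TABLE something (
        id INTEGER PRIMARY KEY,
        one_related_id INTEGER REFERENCES one(id)
    );
""")
conn.executemany(
    "INSERT INTO one VALUES (?, ?, ?)",
    [(i, i * 10, i * 100) for i in range(1, 6)],
)
conn.executemany(
    "INSERT INTO something VALUES (?, ?)",
    [(i, i) for i in range(1, 6)],
)

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it as a database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def lazy_summary():
    """One query for the rows, then one more per row for its related object."""
    lines = []
    rows = run("SELECT id, one_related_id FROM something ORDER BY id")
    for _something_id, one_id in rows:
        first, second = run(
            "SELECT first, second FROM one WHERE id = ?", (one_id,)
        )[0]
        lines.append(f"{first}blablabla{second}")
    return lines


def joined_summary():
    """A single JOIN that brings the related columns along with each row."""
    rows = run(
        "SELECT o.first, o.second "
        "FROM something s JOIN one o ON o.id = s.one_related_id "
        "ORDER BY s.id"
    )
    return [f"{first}blablabla{second}" for first, second in rows]


query_count = 0
lazy = lazy_summary()
lazy_queries = query_count

query_count = 0
joined = joined_summary()
joined_queries = query_count

print(lazy_queries, joined_queries)
```

Both functions build the same summary lines; with the five sample rows the lazy version issues six queries and the joined version one. If the query counter keeps growing during the first loop, the loop is most likely still taking the lazy path.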

I guess the same would apply to the 3rd and 4th loops and prefetch_related, although I don't have enough memory to even test it.
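
For the many_related loops, prefetch_related is documented to run a separate lookup for the relation and do the "joining" in Python. The sketch below imitates that idea with the standard sqlite3 module rather than the Django ORM; the table names, the join table and the sample rows are assumptions invented for the example. It also shows where the memory goes: every related row is held in a dictionary at once.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema that mimics Something.many_related <-> One.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (id INTEGER PRIMARY KEY, first INTEGER, second INTEGER);
    CREATE TABLE something (id INTEGER PRIMARY KEY);
    CREATE TABLE something_many_related (something_id INTEGER, one_id INTEGER);
""")
conn.executemany("INSERT INTO one VALUES (?, ?, ?)", [(1, 10, 100), (2, 20, 200)])
conn.executemany("INSERT INTO something VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO something_many_related VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 2)],
)

# Query 1: the parent rows.
something_ids = [
    row[0] for row in conn.execute("SELECT id FROM something ORDER BY id")
]

# Query 2: all related rows for those parents in a single lookup.
placeholders = ", ".join("?" for _ in something_ids)
related_rows = conn.execute(
    f"""
    SELECT m.something_id, o.first, o.second
    FROM something_many_related m
    JOIN one o ON o.id = m.one_id
    WHERE m.something_id IN ({placeholders})
    ORDER BY m.something_id, o.id
    """,
    something_ids,
).fetchall()

# "Join" in Python: group the related rows by their parent id.
related_by_parent = defaultdict(list)
for something_id, first, second in related_rows:
    related_by_parent[something_id].append((first, second))

# Same shape as loops <<3>> and <<4>>: take the first related row, if any.
summary = []
for something_id in something_ids:
    related = related_by_parent.get(something_id)
    if related:
        first, second = related[0]
        summary.append(f"{first}blablabla{second}")

print(summary)
```

Two queries are issued regardless of how many parent rows there are, at the cost of keeping all related rows in memory while the summary is built.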

So, I was aware that select_related() called without parameters does not follow foreign keys that are nullable. I was not aware, though, that after calling select_related('one_related') it would only select the related object's id if its fields are nullable.

To sum it up, the answer to my question would be to replace:

Something.objects.select_related('one_related')

with

Something.objects.select_related('one_related', 'one_related__first', 'one_related__second')
