Django: prefetch_related has no effect

I'm trying to optimize a database query with prefetch_related, without success.

models.py

from decimal import Decimal
from typing import Optional

from django.db import models


class Order(models.Model):
    # some fields ...

    @property
    def last_operation(self) -> Optional['OrderOperation']:
        try:
            return self.orderoperation_set.latest()
        except OrderOperation.DoesNotExist:
            return None

    @property
    def total(self) -> Optional[Decimal]:
        last_operation = self.last_operation
        return last_operation.total if last_operation else None


class OrderOperation(TimeStampable, models.Model):
    # on_delete is required since Django 2.0; CASCADE was the earlier default
    order = models.ForeignKey(Order, on_delete=models.CASCADE)
    total = models.DecimalField(max_digits=9, decimal_places=2)

Running a shell, I can see the problem:

>>> from django.db import connection
>>> orders = Order.objects.prefetch_related('orderoperation_set')  # There are 1000 orders
>>> result = sum([order.total for order in orders])
>>> len(connection.queries)
1003

As we can see, there is one query per order.total, so 1000 queries. That makes the whole request very slow, because the number of queries grows linearly with the number of orders.

Trying to understand why this happens, I found this in the Django documentation for prefetch_related:

Remember that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.

So it seems normal that each call to latest() runs a new query.

How would you improve performance in this case (making a few queries instead of N, where N is the number of orders)?

Since OrderOperation only contains a single relevant field, total, a better approach would be to annotate the total of the latest operation in the original query using a subquery:

from django.db.models import OuterRef, Subquery
newest = OrderOperation.objects.filter(order=OuterRef('pk')).order_by('-created_at')  # or whatever the timestamp field is
orders = Order.objects.annotate(newest_operation_total=Subquery(newest.values('total')[:1]))
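
For intuition about why the annotation costs a single query, here is a rough sketch of a correlated subquery run through the standard-library sqlite3 module. The table and column names (orders, order_operation, created_at) and the sample rows are assumptions made up for illustration, and the SQL that Django actually generates will differ in its details.

```python
import sqlite3
from decimal import Decimal

# Hypothetical schema and rows, only loosely mirroring Order / OrderOperation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_operation (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        total TEXT NOT NULL,        -- stored as text to keep decimals exact
        created_at TEXT NOT NULL    -- assumed timestamp column
    );
    INSERT INTO orders (id) VALUES (1), (2), (3);
    INSERT INTO order_operation (order_id, total, created_at) VALUES
        (1, '10.00', '2024-01-01'),
        (1, '20.00', '2024-01-02'),
        (2, '5.00',  '2024-01-01');
""")

# One statement: for each order, the subquery picks the newest operation's total.
rows = conn.execute("""
    SELECT o.id,
           (SELECT op.total
              FROM order_operation AS op
             WHERE op.order_id = o.id
             ORDER BY op.created_at DESC
             LIMIT 1) AS newest_operation_total
      FROM orders AS o
     ORDER BY o.id
""").fetchall()

# Orders without any operation come back as NULL (None) and are skipped here.
grand_total = sum(Decimal(total) for _, total in rows if total is not None)
print(rows, grand_total)
```

Order 3 has no operations, so its annotated value is NULL (None in Python), which matches the None that the total property returns for an order without operations.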

I'm posting an answer here; can you tell me whether this makes sense?

Instead of calling latest(), what if I simply take the first item of my queryset with [0] (or the last one with len(qs) - 1), supposing that the order operations are already ordered?

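@property
def last_operation(self) -> Optional['OrderOperation']:
    # all() reuses the prefetch_related cache when one exists, so no new query
    qs = self.orderoperation_set.all()
    # Guard the empty case explicitly: len(qs) - 1 would be -1, and Django
    # rejects negative QuerySet indexes instead of raising IndexError
    return qs[len(qs) - 1] if qs else None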
