Django: prefetch_related has no effect
I'm trying to optimize a DB query using prefetch_related, without success.
from decimal import Decimal
from typing import Optional

from django.db import models


class Order(models.Model):
    # some fields ...

    @property
    def last_operation(self) -> Optional['OrderOperation']:
        try:
            return self.orderoperation_set.latest()
        except OrderOperation.DoesNotExist:
            return None

    @property
    def total(self) -> Optional[Decimal]:
        last_operation = self.last_operation
        return last_operation.total if last_operation else None


class OrderOperation(TimeStampable, models.Model):
    # on_delete is required since Django 2.0; CASCADE was the implicit default before
    order = models.ForeignKey(Order, on_delete=models.CASCADE)
    total = models.DecimalField(max_digits=9, decimal_places=2)
orders = Order.objects.prefetch_related('orderoperation_set') # There are 1000 orders
result = sum([order.total for order in orders])
len(connection.queries)
>>> 1003
As we can see, there is one query per order.total, so 1000 queries. That makes the whole request very slow, with the number of queries growing linearly with the number of orders.
Trying to understand why this is happening, I found this in the prefetch_related Django doc:
Remember that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.
So it seems normal that each call to latest() runs a new query.
How would you improve performance in this case (making a few queries instead of N, where N is the number of orders)?
Since OrderOperation only contains a single relevant field, total, a better approach would be to annotate the total of the latest operation in the original query using a subquery:
from django.db.models import OuterRef, Subquery

# Filter on the foreign key to Order; order by created_at, or whatever the timestamp field is
newest = OrderOperation.objects.filter(order=OuterRef('pk')).order_by('-created_at')
orders = Order.objects.annotate(newest_operation_total=Subquery(newest.values('total')[:1]))
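To see what this annotation asks the database to do, here is a rough sketch of the same idea written as a correlated subquery in plain SQL, run through Python's built-in sqlite3 module. The table and column names (orders, order_operation, created_at) and the sample rows are assumptions made for illustration. The SQL that Django actually generates will differ in naming and detail.

```python
import sqlite3

# In-memory database with hypothetical tables standing in for the two models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY);
CREATE TABLE order_operation (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    total TEXT NOT NULL,        -- stored as text here to keep the sketch simple
    created_at TEXT NOT NULL
);
INSERT INTO orders (id) VALUES (1), (2), (3);
INSERT INTO order_operation (order_id, total, created_at) VALUES
    (1, '10.00', '2020-01-01'),
    (1, '12.50', '2020-01-02'),
    (2, '7.00',  '2020-01-01');
""")

# One statement returns every order together with the total of its newest
# operation. The inner SELECT refers to the outer row (o.id), which is the
# role OuterRef('pk') plays in the ORM version.
rows = conn.execute("""
SELECT o.id,
       (SELECT op.total
          FROM order_operation AS op
         WHERE op.order_id = o.id
         ORDER BY op.created_at DESC
         LIMIT 1) AS newest_operation_total
  FROM orders AS o
 ORDER BY o.id
""").fetchall()

for order_id, newest_total in rows:
    print(order_id, newest_total)
```

An order without any operation gets NULL (None in Python) for the annotated value, so code that sums the totals has to skip or default those rows.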
I'm posting an answer here. Can you tell me whether this makes sense?
Instead of calling latest(), what if I simply get the first item in my queryset with [0] (or the last with len(qs) - 1), supposing that order_operations are already ordered?
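@property
def last_operation(self) -> Optional['OrderOperation']:
    # all() is served from the prefetch_related cache, so no new query is run
    qs = self.orderoperation_set.all()
    # Check for emptiness explicitly: with no operations the index would be -1,
    # and QuerySet rejects negative indexes instead of raising IndexError.
    return qs[len(qs) - 1] if qs else None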