
Django prefetch_related and N+1 - How is it solved?

I am sitting with a query looking like this:

# Get the amount of kilo attached to products
product_data = {}
for productSpy in ProductSpy.objects.all():
    product_data[productSpy.product.product_id] = productSpy.kilo  # RERUN

I do not see how I would be able to use prefetch_related on my last line. The examples in the docs are very simplified and somehow make sense, but I do not understand the whole concept well enough to apply it here. Could someone please explain what is being done and how? I find this very important to understand, and this is the first N+1 problem I have run into.

Thank you up front for your time.

models.py

class ProductSpy(models.Model):
  created_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
  product = models.ForeignKey(Product, on_delete=models.CASCADE)

  def __str__(self):
    return self.kilo


class Product(models.Model):
  product_id = models.IntegerField()
  name = models.CharField(max_length=150)

  def __str__(self):
    return self.name

Django fetches related objects lazily at runtime:

each access to productSpy.product queries the product table, looking up the row by the foreign key stored on the ProductSpy row (productSpy.product_id, not productSpy.id).

Every one of those queries is a separate database round trip, so inside a loop this code is highly inefficient. Using prefetch_related fetches the products for all the ProductSpy objects in one additional query, which gives better performance.

# Get the amount of kilo attached to products
product_data = {}
# prefetch_related returns a new queryset, so its result must be kept.
# 'kilo' is a plain field, not a relation, so it needs no prefetching.
product_spies = ProductSpy.objects.prefetch_related('product')
for productSpy in product_spies:
    product_data[productSpy.product.product_id] = productSpy.kilo  # RERUN

When one writes productSpy.product and the related object has not already been fetched, Django automatically makes a query to the database to get the related Product instance. Hence, if ProductSpy.objects.all() returns N instances, accessing productSpy.product in a loop makes N more queries, which is what we call the N + 1 problem.

Moving further, although you can use prefetch_related (it will use 2 queries in your case), here it would be better to use select_related [Django docs], which uses a SQL JOIN (an INNER JOIN here, since the foreign key is not nullable) and gets you the related instances in a single query:

product_data = {}
queryset = ProductSpy.objects.select_related('product')
for productSpy in queryset:
    product_data[productSpy.product.product_id] = productSpy.kilo # No extra queries as we used `select_related`
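
To make the difference in query counts concrete, here is a rough sketch that imitates the three access patterns with plain sqlite3 instead of Django. It is not the SQL Django actually generates; the table layout, the column name product_fk, the kilo column (the question uses kilo but does not show the field), the sample rows and the helper names are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema loosely mirroring the two models in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, product_id INTEGER, name TEXT);
    CREATE TABLE productspy (
        id INTEGER PRIMARY KEY,
        product_fk INTEGER REFERENCES product(id),
        kilo INTEGER
    );
    INSERT INTO product VALUES (1, 100, 'apples'), (2, 200, 'pears'), (3, 300, 'plums');
    INSERT INTO productspy VALUES (1, 1, 5), (2, 2, 7), (3, 3, 9);
""")

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def lazy_loading():
    """N + 1 pattern: one query for the spies, then one query per spy."""
    data = {}
    for _spy_id, product_fk, kilo in run("SELECT id, product_fk, kilo FROM productspy"):
        rows = run("SELECT product_id FROM product WHERE id = ?", (product_fk,))
        data[rows[0][0]] = kilo
    return data


def prefetch_style():
    """Two queries: the spies, then all needed products via IN (...)."""
    spies = run("SELECT id, product_fk, kilo FROM productspy")
    fks = sorted({fk for _, fk, _ in spies})
    placeholders = ", ".join("?" * len(fks))
    products = dict(
        run(f"SELECT id, product_id FROM product WHERE id IN ({placeholders})", fks)
    )
    # The two result sets are stitched together in Python.
    return {products[fk]: kilo for _, fk, kilo in spies}


def join_style():
    """One query: the database joins both tables."""
    return dict(
        run(
            "SELECT p.product_id, s.kilo "
            "FROM productspy s JOIN product p ON p.id = s.product_fk"
        )
    )


def count_queries(func):
    """Run func and return (result, number of queries it issued)."""
    global query_count
    query_count = 0
    result = func()
    return result, query_count


for strategy in (lazy_loading, prefetch_style, join_style):
    result, used = count_queries(strategy)
    print(strategy.__name__, used, result)
```

All three functions build the same dictionary; only the number of round trips differs, and with lazy loading it grows with the number of ProductSpy rows.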

Note: There seems to be a problem with your logic here, though. Multiple ProductSpy instances can point to the same Product, so your loop might overwrite some values.
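
A small plain-Python sketch of that overwrite, and of one possible way around it. The (product_id, kilo) pairs are made-up sample data, and summing the kilos per product is only an assumption about what you want; keeping a list per product would work the same way.

```python
from collections import defaultdict

# Hypothetical (product_id, kilo) pairs; product 100 appears twice.
rows = [(100, 5), (200, 7), (100, 3)]

# Plain assignment keeps only the last kilo seen for each product.
overwritten = {}
for product_id, kilo in rows:
    overwritten[product_id] = kilo

# Accumulating per key keeps every row's contribution.
totals = defaultdict(int)
for product_id, kilo in rows:
    totals[product_id] += kilo

print(overwritten)
print(dict(totals))
```

If a per-product sum really is what you need, Django can also compute it in the database with values() and annotate() using Sum, which avoids the Python loop entirely.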

