Django `bulk_create` with related objects

I have a Django system that runs billing for thousands of customers on a regular basis. Here are my models:

from django.db import models

class Invoice(models.Model):
    balance = models.DecimalField(
        max_digits=6,
        decimal_places=2,
    )

class Transaction(models.Model):
    amount = models.DecimalField(
        max_digits=6,
        decimal_places=2,
    )
    invoice = models.ForeignKey(
        Invoice,
        on_delete=models.CASCADE,
        related_name='invoices',
        null=False
    )

When billing is run, thousands of invoices with tens of transactions each are created using several nested for loops, which triggers an insert for each created record. I could run bulk_create() on the transactions for each individual invoice, but this still results in thousands of calls to bulk_create().

How would one bulk-create thousands of related models so that the relationship is maintained and the database is used in the most efficient way possible?

Notes:

  • I'm looking for a native Django solution that would work on all databases (with the possible exception of SQLite).
  • My system runs billing in a Celery task to decouple long-running code from active requests, but I am still concerned with how long it takes to complete a billing cycle.
  • The solution should assume that other requests or running tasks are also reading from and writing to the tables in question.

You could bulk_create all the Invoice objects, refresh them from the database so that they all have ids, create the Transaction objects for all the invoices, and then save those with bulk_create as well. All of this can be done inside a single transaction.atomic context.

Also, specifically for Django 1.10 and PostgreSQL, look at this answer.
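
The answer above leaves open how to read the ids back when the database does not return them from a bulk insert. One possible way, sketched here with the standard-library sqlite3 module as a stand-in for the ORM, is to tag each invoice with a client-generated key and look the ids up by that key. The `ref` column, the `billing_transaction` table name, and the `billing_data` input are assumptions made for illustration; none of them appear in the original models.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        balance TEXT NOT NULL,
        ref TEXT NOT NULL UNIQUE  -- hypothetical client-generated key
    );
    CREATE TABLE billing_transaction (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        amount TEXT NOT NULL,
        invoice_id INTEGER NOT NULL REFERENCES invoice(id)
    );
    """
)

# Hypothetical billing input: one balance and a list of amounts per invoice.
billing_data = [
    {"balance": "30.00", "amounts": ["10.00", "20.00"]},
    {"balance": "5.50", "amounts": ["5.50"]},
]
for row in billing_data:
    row["ref"] = uuid.uuid4().hex  # unique per invoice, generated client-side

with conn:  # one transaction: commits on success, rolls back on error
    # Step 1: insert all invoices with a single executemany call.
    conn.executemany(
        "INSERT INTO invoice (balance, ref) VALUES (?, ?)",
        [(row["balance"], row["ref"]) for row in billing_data],
    )

    # Step 2: read back only the ids of the rows inserted above.
    refs = [row["ref"] for row in billing_data]
    placeholders = ",".join("?" * len(refs))
    id_by_ref = dict(
        conn.execute(
            f"SELECT ref, id FROM invoice WHERE ref IN ({placeholders})", refs
        )
    )

    # Step 3: insert all transactions, each pointing at its invoice id.
    conn.executemany(
        "INSERT INTO billing_transaction (amount, invoice_id) VALUES (?, ?)",
        [
            (amount, id_by_ref[row["ref"]])
            for row in billing_data
            for amount in row["amounts"]
        ],
    )

transaction_count = conn.execute(
    "SELECT COUNT(*) FROM billing_transaction"
).fetchone()[0]
```

Because the lookup is keyed on values only this run generated, rows written concurrently by other requests or tasks are not picked up by mistake. With thousands of invoices, the `IN (...)` lookup would probably need to be chunked to stay under the database's parameter limit.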

You can do it with two bulk-create queries, using the following approach.

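new_invoices = []
new_transactions = []
# billing_data is a hypothetical iterable of dicts, one per invoice,
# each with a "balance" and a list of transaction "amounts".
for invoice_data in billing_data:
    invoice = Invoice(balance=invoice_data["balance"])
    new_invoices.append(invoice)

    for amount in invoice_data["amounts"]:
        transaction = Transaction(amount=amount)
        # The invoice is still unsaved here, so transaction.invoice_id is None.
        transaction.invoice = invoice
        new_transactions.append(transaction)

# On backends that return primary keys from bulk inserts (PostgreSQL, for
# example), this sets the id on every object in new_invoices.
Invoice.objects.bulk_create(new_invoices)

# Copy the freshly assigned primary keys onto the transactions.
for each in new_transactions:
    each.invoice_id = each.invoice.id

Transaction.objects.bulk_create(new_transactions)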
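
The loop that copies each.invoice.id into each.invoice_id can look redundant. The toy classes below are a simplified illustration of why it is there, assuming the foreign-key id is captured at the moment the related object is assigned. `FakeInvoice` and `FakeTransaction` are hypothetical stand-ins written for this sketch, not Django's actual implementation.

```python
class FakeInvoice:
    """Stand-in for an unsaved model instance; id is None until 'saved'."""

    def __init__(self):
        self.id = None


class FakeTransaction:
    """Stand-in that copies the related object's id at assignment time."""

    def __init__(self):
        self._invoice = None
        self.invoice_id = None

    @property
    def invoice(self):
        return self._invoice

    @invoice.setter
    def invoice(self, value):
        self._invoice = value
        self.invoice_id = value.id  # a snapshot, not a live reference


invoice = FakeInvoice()
tx = FakeTransaction()
tx.invoice = invoice  # invoice.id is still None at this point

invoice.id = 42  # simulates the bulk insert filling in the primary key
stale_id = tx.invoice_id  # still None: the snapshot was taken too early

tx.invoice_id = tx.invoice.id  # the fix-up step from the loop above
fixed_id = tx.invoice_id
```

Without the fix-up step, the second bulk insert would be attempted with an empty invoice_id on every transaction.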

Another approach looks like the code snippet below:

from django.db import transaction

new_invoices = []
for sth in sth_else:
    ...
    invoice = Invoice(params)
    new_invoices.append(invoice)

with transaction.atomic():
    # Evaluate the queryset now. A lazy queryset would be re-run as a subquery
    # after the insert and would then match the new rows as well.
    other_invoice_ids = list(Invoice.objects.values_list('id', flat=True))
    Invoice.objects.bulk_create(new_invoices)

    # Invoices that did not exist before the insert. Rows committed by other
    # writers in the meantime may also show up here, depending on the
    # isolation level.
    new_invoice_ids = Invoice.objects.exclude(id__in=other_invoice_ids).values_list('id', flat=True)

    new_transactions = []
    for invoice_id in new_invoice_ids:
        # Named new_transaction so that it does not shadow django.db.transaction.
        new_transaction = Transaction(params, invoice_id=invoice_id)
        new_transactions.append(new_transaction)

    Transaction.objects.bulk_create(new_transactions)

I wrote this answer based on this post on another question in the community.
