Updating fields via subquery in Django

I have an app whose models and database schema are shown below. I am trying to add a field r to L2 so that the related objects from model R can be accessed directly. The new field is not shown in the schema figure.

Retrieving the desired value of field r using a subquery and an annotation works as expected. However, populating the field with an update() call on the annotated queryset does not work. Do I have to modify my subquery? Or is this not possible in Django at all without resorting to raw SQL?

Models and schema

from django.db import models

class L1(models.Model):
    desc = models.CharField(max_length=16)
    m1 = models.ForeignKey('M1', on_delete=models.CASCADE)

class L2(models.Model):
    desc = models.CharField(max_length=16)
    l1 = models.ForeignKey('L1', on_delete=models.CASCADE)
    m2 = models.ForeignKey('M2', on_delete=models.CASCADE)

    # r is the field added
    r = models.ForeignKey('R', null=True, default=None, on_delete=models.SET_NULL)

class M1(models.Model):
    desc = models.CharField(max_length=16)

class M2(models.Model):
    desc = models.CharField(max_length=16)

class R(models.Model):
    desc = models.CharField(max_length=16)
    m1 = models.ForeignKey('M1', on_delete=models.CASCADE)
    m2 = models.ForeignKey('M2', on_delete=models.CASCADE)

Database schema

Sample code

from random import randint
from django.db import connection, reset_queries
from django.db.models import F, OuterRef, Subquery
from myapp.models import L1, L2, M1, M2, R

# create random data
for m in range(10):
    M1.objects.create(desc=f'M1_{m:02d}')
    M2.objects.create(desc=f'M2_{m:02d}')
for r in range(40):
    R.objects.create(desc=f'R_{r:02d}', m1_id=randint(1, 10), m2_id=randint(1, 10))
for l1 in range(20):
    L1.objects.create(desc=f'L1_{l1:02d}', m1_id=randint(1, 10))
for l2 in range(100):
    L2.objects.create(desc=f'L2_{l2:02d}', l1_id=randint(1, 20), m2_id=randint(1, 10))

# use subquery to annotate model - success
# (connection.queries is only populated when settings.DEBUG is True)
reset_queries()
subquery = Subquery(R.objects.filter(m2_id=OuterRef('m2_id'), m1_id=OuterRef('l1__m1_id')).values('id')[:1])
annotated = L2.objects.all().annotate(_r_id=subquery)
annotated_l = list(annotated)
print(connection.queries[-1])
# produces SQL-1

# use subquery to annotate and update model - failure
reset_queries()
annotated.update(r_id=F('_r_id'))
# ...
# django.db.utils.ProgrammingError: missing FROM-clause entry for table "myapp_l1"
# LINE 1: ...ECT U0."id" FROM "myapp_r" U0 WHERE (U0."m1_id" = "myapp_l1"...
#                                                              ^
print(connection.queries[-1])
# produces SQL-2

SQL-1

SELECT
    "myapp_l2"."id",
    "myapp_l2"."desc",
    "myapp_l2"."l1_id",
    "myapp_l2"."m2_id",
    "myapp_l2"."r_id",
    (
        SELECT
            U0."id"
        FROM
            "myapp_r" U0
        WHERE (U0."m1_id" = "myapp_l1"."m1_id"
            AND U0."m2_id" = "myapp_l2"."m2_id")
    LIMIT 1) AS "_r_id"
FROM
    "myapp_l2"
    INNER JOIN "myapp_l1" ON ("myapp_l2"."l1_id" = "myapp_l1"."id")

SQL-2

UPDATE
    "myapp_l2"
SET
    "r_id" = (
        SELECT
            U0."id"
        FROM
            "myapp_r" U0
        WHERE (U0."m1_id" = "myapp_l1"."m1_id"
            AND U0."m2_id" = "myapp_l2"."m2_id")
    LIMIT 1)
WHERE
    "myapp_l2"."id" IN (
        SELECT
            V0."id"
        FROM
            "myapp_l2" V0
            INNER JOIN "myapp_l1" V1 ON (V0."l1_id" = V1."id"))
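
The failure in SQL-2 can be reproduced outside Django. The following is a minimal sketch that assumes SQLite (via Python's built-in sqlite3 module) instead of PostgreSQL, so the error text differs from the one above, and it uses cut-down versions of the three tables with only the columns the query touches. It suggests the cause: an UPDATE statement has only its target table in scope, so the reference to "myapp_l1" inside the SET subquery cannot be resolved.

```python
import sqlite3

# In-memory database with simplified stand-ins for the three tables
# (assumption: only the columns used by the query are included).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_l1 (id INTEGER PRIMARY KEY, m1_id INTEGER);
    CREATE TABLE myapp_l2 (id INTEGER PRIMARY KEY, l1_id INTEGER,
                           m2_id INTEGER, r_id INTEGER);
    CREATE TABLE myapp_r  (id INTEGER PRIMARY KEY, m1_id INTEGER, m2_id INTEGER);
""")

# Same shape as SQL-2: the SET subquery refers to myapp_l1, but the join to
# myapp_l1 only exists inside the separate WHERE ... IN (...) subquery.
sql_2 = """
    UPDATE myapp_l2
    SET r_id = (
        SELECT U0.id FROM myapp_r U0
        WHERE U0.m1_id = myapp_l1.m1_id AND U0.m2_id = myapp_l2.m2_id
        LIMIT 1)
    WHERE myapp_l2.id IN (
        SELECT V0.id FROM myapp_l2 V0
        INNER JOIN myapp_l1 V1 ON V0.l1_id = V1.id)
"""

try:
    conn.execute(sql_2)
    error = None
except sqlite3.OperationalError as exc:
    # SQLite rejects the statement because myapp_l1 is not in scope.
    error = str(exc)

print(error)
```

The join that Django added for the annotation ends up in the WHERE clause as aliases V0 and V1, which the SET expression cannot see.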

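The following finally did the trick. It was inspired by the answer here; in this case, however, a nested subquery has to be used. The annotation's subquery refers to a column of myapp_l1, and the UPDATE statement in SQL-2 has no join to that table. Wrapping the annotated queryset in a second subquery, correlated on id, moves the join inside the subquery (see SQL-good).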

For the record, performance was pretty good: it took around 50 seconds to update 1.5M L2 objects, with 830K L1 objects and 12K R objects.

from django.db import connection, reset_queries
from django.db.models import OuterRef, Subquery
from myapp.models import L1, L2, M1, M2, R

# create queryset with annotation
subquery = Subquery(R.objects.filter(m2_id=OuterRef('m2_id'), m1_id=OuterRef('l1__m1_id')).values('id')[:1])
annotated = L2.objects.annotate(_r_id=subquery)

# use the queryset in a subquery to get the annotation value
reset_queries()
L2.objects.update(r_id=Subquery(annotated.filter(id=OuterRef('id')).values('_r_id')[:1]))
print(connection.queries[-1])
# produces SQL-good

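# verify results with a loop
# (first() orders by primary key, while the subquery's LIMIT 1 has no ORDER BY,
# so a mismatch is possible when several R rows match the same m1/m2 pair)
for l2 in L2.objects.all():
    r = R.objects.filter(m1_id=l2.l1.m1_id, m2_id=l2.m2_id).first()
    print(f'{str(r == l2.r):5s} {str(r):10s} {str(l2.r):10s}')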

SQL-good

UPDATE
    "myapp_l2"
SET
    "r_id" = (
        SELECT
            (
                SELECT
                    U0."id"
                FROM
                    "myapp_r" U0
                WHERE (U0."m1_id" = V1."m1_id"
                    AND U0."m2_id" = V0."m2_id")
            LIMIT 1) AS "_r_id"
    FROM
        "myapp_l2" V0
        INNER JOIN "myapp_l1" V1 ON (V0."l1_id" = V1."id")
    WHERE
        V0."id" = "myapp_l2"."id"
    LIMIT 1)
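
To see the nested form at work without a Django project, here is a minimal sketch that runs a statement of the same shape as SQL-good against SQLite through Python's built-in sqlite3 module. The cut-down tables and the sample rows are assumptions made for illustration, not data from the original app; each (m1, m2) pair matches at most one R row, so the missing ORDER BY does not matter here.

```python
import sqlite3

# In-memory database with simplified stand-ins for the three tables
# (assumption: only the columns used by the query are included).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_l1 (id INTEGER PRIMARY KEY, m1_id INTEGER);
    CREATE TABLE myapp_l2 (id INTEGER PRIMARY KEY, l1_id INTEGER,
                           m2_id INTEGER, r_id INTEGER);
    CREATE TABLE myapp_r  (id INTEGER PRIMARY KEY, m1_id INTEGER, m2_id INTEGER);

    -- Hypothetical sample rows.
    INSERT INTO myapp_l1 (id, m1_id) VALUES (1, 10), (2, 20);
    INSERT INTO myapp_r  (id, m1_id, m2_id) VALUES (100, 10, 7), (200, 20, 8);
    INSERT INTO myapp_l2 (id, l1_id, m2_id) VALUES (1, 1, 7), (2, 2, 8), (3, 1, 8);
""")

# Same shape as SQL-good: the join to myapp_l1 (alias V1) lives inside the
# outer subquery, which is correlated with the row being updated via V0.id.
conn.execute("""
    UPDATE myapp_l2
    SET r_id = (
        SELECT (
            SELECT U0.id FROM myapp_r U0
            WHERE U0.m1_id = V1.m1_id AND U0.m2_id = V0.m2_id
            LIMIT 1) AS _r_id
        FROM myapp_l2 V0
        INNER JOIN myapp_l1 V1 ON V0.l1_id = V1.id
        WHERE V0.id = myapp_l2.id
        LIMIT 1)
""")

rows = conn.execute("SELECT id, r_id FROM myapp_l2 ORDER BY id").fetchall()
print(rows)  # [(1, 100), (2, 200), (3, None)]
```

The third L2 row has no matching R row, so its r_id stays NULL, which matches the null=True declaration on the new field.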
