
Django: aggregate returns a wrong result after using annotate

When aggregating a queryset, I noticed that I get a wrong result if I used an annotation beforehand. I can't understand why.

The code

from django.db.models import QuerySet, Max, F, ExpressionWrapper, DecimalField, Sum
from orders.models import OrderOperation

class OrderOperationQuerySet(QuerySet):
    def last_only(self) -> QuerySet:
        return self \
            .annotate(last_oo_pk=Max('order__orderoperation__pk')) \
            .filter(pk=F('last_oo_pk'))

    @staticmethod
    def _hist_price(orderable_field):
        return ExpressionWrapper(
            F(f'{orderable_field}__hist_unit_price') * F(f'{orderable_field}__quantity'),
            output_field=DecimalField())

    def ordered_articles_data(self):
        return self.aggregate(
            sum_ordered_articles_amounts=Sum(self._hist_price('orderedarticle')))

The test

qs1 = OrderOperation.objects.filter(order__pk=31655)
qs2 = OrderOperation.objects.filter(order__pk=31655).last_only()
assert qs1.count() == qs2.count() == 1 and qs1[0] == qs2[0]  # shows that both querysets contain the same object

qs1.ordered_articles_data()
> {'sum_ordered_articles_amounts': Decimal('3.72')}  # expected result

qs2.ordered_articles_data()
> {'sum_ordered_articles_amounts': Decimal('3.01')}  # wrong result

How can the last_only annotation method make the aggregation result different (and wrong)?

The "funny" thing is that this only seems to happen when the order contains articles with the same hist_price (screenshot not preserved here).

Side note

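SQL queries (note that these are the actual queries, but the code above has been slightly simplified, which explains the COALESCE and "deleted" IS NULL that appear below)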

qs1.ordered_articles_data()

SELECT
    COALESCE(
        SUM(
            ("orders_orderedarticle"."hist_unit_price" * "orders_orderedarticle"."quantity")
        ),
        0) AS "sum_ordered_articles_amounts"
FROM "orders_orderoperation"
    LEFT OUTER JOIN "orders_orderedarticle"
        ON ("orders_orderoperation"."id" = "orders_orderedarticle"."order_operation_id")
WHERE ("orders_orderoperation"."order_id" = 31655 AND "orders_orderoperation"."deleted" IS NULL)

qs2.ordered_articles_data()

SELECT COALESCE(SUM(("__col1" * "__col2")), 0)
FROM (
    SELECT
        "orders_orderoperation"."id" AS Col1,
        MAX(T3."id") AS "last_oo_pk",
        "orders_orderedarticle"."hist_unit_price" AS "__col1",
        "orders_orderedarticle"."quantity" AS "__col2"
    FROM "orders_orderoperation" INNER JOIN "orders_order"
        ON ("orders_orderoperation"."order_id" = "orders_order"."id")
        LEFT OUTER JOIN "orders_orderoperation" T3
            ON ("orders_order"."id" = T3."order_id")
        LEFT OUTER JOIN "orders_orderedarticle"
            ON ("orders_orderoperation"."id" = "orders_orderedarticle"."order_operation_id")
    WHERE ("orders_orderoperation"."order_id" = 31655 AND "orders_orderoperation"."deleted" IS NULL)
    GROUP BY
        "orders_orderoperation"."id",
        "orders_orderedarticle"."hist_unit_price",
        "orders_orderedarticle"."quantity"
    HAVING "orders_orderoperation"."id" = (MAX(T3."id"))
) subquery

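When you use any annotation that maps to an aggregate function in the database language, the query has to group by every selected field that is not inside the function. You can see this in the subquery: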

GROUP BY
    "orders_orderoperation"."id",
    "orders_orderedarticle"."hist_unit_price",
    "orders_orderedarticle"."quantity"
HAVING "orders_orderoperation"."id" = (MAX(T3."id"))

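As a result, articles with the same hist_unit_price and quantity end up in the same group, so they are collapsed into a single row before the max id filter in HAVING is applied. Judging by your screenshot, that is what happens to chocolate and cafe: only one of them is counted in the sum.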
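The collapse can be reproduced outside Django. The following is a minimal sketch using Python's built-in sqlite3 module, not the asker's real schema: the table layout is reduced to the columns used in the SQL above, the table aliases are shortened, and the three article rows are made-up values (prices in cents) chosen only so that the totals mirror the 3.72 and 3.01 from the question.

```python
import sqlite3

# Hypothetical, reduced schema: only the columns used by the two queries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderoperation (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE orderedarticle (
        id INTEGER PRIMARY KEY,
        order_operation_id INTEGER,
        hist_unit_price INTEGER,  -- in cents, to keep the arithmetic exact
        quantity INTEGER
    );
    INSERT INTO orderoperation (id, order_id) VALUES (1, 31655);
    -- Made-up data: two articles share the same price and quantity.
    INSERT INTO orderedarticle (order_operation_id, hist_unit_price, quantity)
    VALUES (1, 71, 1), (1, 71, 1), (1, 230, 1);
""")

# Shape of the qs1 query: a plain aggregate over the join.
plain_total = conn.execute("""
    SELECT COALESCE(SUM(oa.hist_unit_price * oa.quantity), 0)
    FROM orderoperation oo
    LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
    WHERE oo.order_id = 31655
""").fetchone()[0]

# Shape of the qs2 query: the inner GROUP BY merges the two identical articles.
grouped_total = conn.execute("""
    SELECT COALESCE(SUM(col1 * col2), 0)
    FROM (
        SELECT oo.id AS oo_id, MAX(t3.id) AS last_oo_pk,
               oa.hist_unit_price AS col1, oa.quantity AS col2
        FROM orderoperation oo
        LEFT OUTER JOIN orderoperation t3 ON oo.order_id = t3.order_id
        LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
        WHERE oo.order_id = 31655
        GROUP BY oo.id, oa.hist_unit_price, oa.quantity
        HAVING oo.id = MAX(t3.id)
    ) subquery
""").fetchone()[0]

print(plain_total, grouped_total)  # 372 301
```

The two rows with price 71 and quantity 1 fall into one group, so the outer SUM sees them only once.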

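Moving the filter into a separate subquery with fewer joins is a solution that avoids the problems caused by additional joins to child objects: possibly an unnecessarily huge Cartesian product of independent sets, or a GROUP BY clause in the resulting SQL that is hard to control because several parts of the query contribute to it.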

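Solution: a subquery is used to get the primary keys of the last order operations. The outer query stays simple, with no added joins or groups, so it is not distorted by a possible aggregation on child objects.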

    def last_only(self) -> QuerySet:
        max_ids = (self.values('order').order_by()
                   .annotate(last_oo_pk=Max('order__orderoperation__pk'))
                   .values('last_oo_pk')
                   )
        return self.filter(pk__in=max_ids)

Test

ret = (OrderOperationQuerySet(OrderOperation).filter(order__in=[some_order])
       .last_only().ordered_articles_data())

Executed SQL (simplified by removing the app name prefix orders_ and the double quotes "):

SELECT CAST(SUM((orderedarticle.hist_unit_price * orderedarticle.quantity))
       AS NUMERIC) AS sum_ordered_articles_amounts
FROM orderoperation
LEFT OUTER JOIN orderedarticle ON (orderoperation.id = orderedarticle.order_operation_id)
WHERE (
  orderoperation.order_id IN (31655) AND
  orderoperation.id IN (
    SELECT MAX(U2.id) AS last_oo_pk
    FROM orderoperation U0
    INNER JOIN order U1 ON (U0.order_id = U1.id)
    LEFT OUTER JOIN orderoperation U2 ON (U1.id = U2.order_id)
    WHERE U0.order_id IN (31655)
    GROUP BY U0.order_id
  )
)
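To check the shape of this query without a Django project, here is a small sketch with Python's built-in sqlite3 module. It rests on assumptions: the schema is reduced to the columns the query touches, the join through the order table (U1) is dropped because it does not change the result here, and the rows are made-up values (prices in cents). The order has two operations, and only the articles of the last one should be summed.

```python
import sqlite3

# Hypothetical, reduced schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderoperation (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE orderedarticle (
        id INTEGER PRIMARY KEY,
        order_operation_id INTEGER,
        hist_unit_price INTEGER,  -- in cents
        quantity INTEGER
    );
    INSERT INTO orderoperation (id, order_id) VALUES (1, 31655), (2, 31655);
    INSERT INTO orderedarticle (order_operation_id, hist_unit_price, quantity)
    VALUES (1, 100, 1),                          -- belongs to the older operation
           (2, 71, 1), (2, 71, 1), (2, 230, 1);  -- belongs to the last operation
""")

# The last operation is picked by pk IN (subquery); the outer query has no GROUP BY.
total = conn.execute("""
    SELECT SUM(oa.hist_unit_price * oa.quantity)
    FROM orderoperation oo
    LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
    WHERE oo.order_id IN (31655)
      AND oo.id IN (
          SELECT MAX(u2.id)
          FROM orderoperation u0
          LEFT OUTER JOIN orderoperation u2 ON u0.order_id = u2.order_id
          WHERE u0.order_id IN (31655)
          GROUP BY u0.order_id
      )
""").fetchone()[0]

print(total)  # 372
```

Both articles with price 71 are counted, and the article attached to the older operation is left out.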

The original invalid SQL could also be fixed by adding "orders_orderedarticle"."id" to the GROUP BY, but only if last_only() and ordered_articles_data() are used together. That approach is not very readable.
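The effect of the extra GROUP BY column can be seen in a small sqlite3 sketch. This is only an illustration at the SQL level under assumed data: the schema is reduced, the rows are made up (prices in cents), and grouped_total is a hypothetical helper that rebuilds the grouped subquery with a given column list. It does not show how to make Django emit that GROUP BY.

```python
import sqlite3

# Hypothetical, reduced schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderoperation (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE orderedarticle (
        id INTEGER PRIMARY KEY,
        order_operation_id INTEGER,
        hist_unit_price INTEGER,  -- in cents
        quantity INTEGER
    );
    INSERT INTO orderoperation (id, order_id) VALUES (1, 31655);
    INSERT INTO orderedarticle (order_operation_id, hist_unit_price, quantity)
    VALUES (1, 71, 1), (1, 71, 1), (1, 230, 1);
""")


def grouped_total(group_by_columns):
    """Sum price * quantity over the grouped subquery for a given GROUP BY list."""
    sql = f"""
        SELECT COALESCE(SUM(col1 * col2), 0)
        FROM (
            SELECT oo.id AS oo_id, MAX(t3.id) AS last_oo_pk,
                   oa.hist_unit_price AS col1, oa.quantity AS col2
            FROM orderoperation oo
            LEFT OUTER JOIN orderoperation t3 ON oo.order_id = t3.order_id
            LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
            WHERE oo.order_id = 31655
            GROUP BY {group_by_columns}
            HAVING oo.id = MAX(t3.id)
        ) subquery
    """
    return conn.execute(sql).fetchone()[0]


# GROUP BY as generated for qs2: identical articles share a group.
original = grouped_total("oo.id, oa.hist_unit_price, oa.quantity")
# GROUP BY with the article id added: every article keeps its own group.
with_article_id = grouped_total("oo.id, oa.id, oa.hist_unit_price, oa.quantity")

print(original, with_article_id)  # 301 372
```

With the article id in the GROUP BY, each ordered article forms its own group, so nothing is merged before the outer SUM.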

