簡體   English   中英

分析並提高由Django ORM(SORT)生成的查詢的性能

[英]Analysing and improving the performance of a query generated by Django's ORM (SORT)

根據我的Django模型中的幾種條件,我有一個不太復雜的(imho)過濾邏輯。 有一個特定的查詢需要很長時間才能完成。

該查詢是基於這兩個查詢集構建的:

queryset = self.serializer_class.Meta.model.valid_pricelist_objects.filter(
    Q(drug_prices__pricelist__price_destination__to_all_insurances=True
      ) |
    # pylint: disable=line-too-long
    Q(
        drug_prices__pricelist__price_destination__to_organization_data__organization__uuid
        =self.kwargs.get('i_uuid')))
return queryset

return super().get_queryset().filter(
    Q(active=True),
    Q(drug_prices__pricelist__active=True), # Lista de precios activa
    # Q(drug_pictures__is_main=True), # Que tenga una imagen
    # TODO: Hacer filtros por pais PriceListCountries
    Q(
        Q(drug_prices__pricelist__expires=False) | # Que tenga precios que no caducan o
        Q(
            Q(drug_prices__pricelist__expires=True), # Que tenga precios que si caducan Y
            Q(drug_prices__pricelist__datestart__date__lte=timezone.now()),  # Fecha de inicio menor que hoy Y
            Q(drug_prices__pricelist__dateend__date__gte=timezone.now())  # Fecha final mayor que hoy
        )
    )
).distinct().prefetch_related(
    'categories__contexts',
    'brands',
    'drug_prices__pricelist',
    'drug_pictures',
    'drug_prices__pricelist__price_destination',
)

第二個查詢集包裝第一個查詢集(通過super()調用)。 結果查詢如下所示:

SELECT DISTINCT
    "phdrug_phdrug"."id",
    "phdrug_phdrug"."uuid",
    "phdrug_phdrug"."default_description",
    "phdrug_phdrug"."ean",
    "phdrug_phdrug"."parent_ean",
    "phdrug_phdrug"."reg_num",
    "phdrug_phdrug"."atc_iv",
    "phdrug_phdrug"."product_type",
    "phdrug_phdrug"."fraction",
    "phdrug_phdrug"."active",
    "phdrug_phdrug"."loyal",
    "phdrug_phdrug"."patent",
    "phdrug_phdrug"."chronics",
    "phdrug_phdrug"."recipe",
    "phdrug_phdrug"."deal",
    "phdrug_phdrug"."specialized",
    "phdrug_phdrug"."armored",
    "phdrug_phdrug"."hight_speciality",
    "phdrug_phdrug"."temp_8_15",
    "phdrug_phdrug"."temp_15_25",
    "phdrug_phdrug"."temp_2_8",
    "phdrug_phdrug"."temp_less_15",
    "phdrug_phdrug"."new",
    "phdrug_phdrug"."mdk_internal_code",
    "phdrug_phdrug"."mdk_single_id",
    "phdrug_phdrug"."is_from_mdk_db",
    "phdrug_phdrug"."top",
    "phdrug_phdrug"."laboratory_id",
    "phdrug_phdrug"."specialty_id" 
FROM
    "phdrug_phdrug"
    INNER JOIN "monetary_drugprice" ON ( "phdrug_phdrug"."id" = "monetary_drugprice"."drug_id" )
    INNER JOIN "monetary_pricelist" ON ( "monetary_drugprice"."pricelist_id" = "monetary_pricelist"."id" )
    INNER JOIN "monetary_drugprice" T4 ON ( "phdrug_phdrug"."id" = T4."drug_id" )
    INNER JOIN "monetary_pricelist" T5 ON ( T4."pricelist_id" = T5."id" )
    INNER JOIN "monetary_pricelistdestinations" ON ( T5."id" = "monetary_pricelistdestinations"."pricelist_id" )
    LEFT OUTER JOIN "organization_organizationdata" ON ( "monetary_pricelistdestinations"."to_organization_data_id" = "organization_organizationdata"."id" )
    LEFT OUTER JOIN "organization_organization" ON ( "organization_organizationdata"."organization_id" = "organization_organization"."id" ) 
WHERE
    (
        "phdrug_phdrug"."active" = TRUE 
        AND "monetary_pricelist"."active" = TRUE 
        AND (
            "monetary_pricelist"."expires" = FALSE 
            OR (
                "monetary_pricelist"."expires" = TRUE 
                AND ( "monetary_pricelist"."datestart" AT TIME ZONE'UTC' ) :: DATE <= '2019-01-22' 
                AND ( "monetary_pricelist"."dateend" AT TIME ZONE'UTC' ) :: DATE >= '2019-01-22' 
            )) 
        AND (
            "monetary_pricelistdestinations"."to_all_insurances" = TRUE 
            OR "organization_organization"."uuid" = 'b51773d4-05f8-43a2-86ef-0098b31725d8' 
        )) 
ORDER BY
    "phdrug_phdrug"."default_description" ASC

使用EXPLAIN ANALYZE運行查詢,我得到以下信息:

Unique  (cost=10412.31..12666.32 rows=29084 width=143) (actual time=3373.496..3620.090 rows=6442 loops=1)
  ->  Sort  (cost=10412.31..10485.02 rows=29084 width=143) (actual time=3373.494..3460.790 rows=228667 loops=1)
        Sort Key: phdrug_phdrug.default_description, phdrug_phdrug.id, phdrug_phdrug.uuid, phdrug_phdrug.ean, phdrug_phdrug.parent_ean, phdrug_phdrug.reg_num, phdrug_phdrug.medika_code, phdrug_phdrug.atc_iv, phdrug_phdrug.product_type, phdrug_phdrug.fraction, phdrug_phdrug.active, phdrug_phdrug.loyal, phdrug_phdrug.patent, phdrug_phdrug.chronics, phdrug_phdrug.recipe, phdrug_phdrug.deal, phdrug_phdrug.specialized, phdrug_phdrug.armored, phdrug_phdrug.hight_speciality, phdrug_phdrug.temp_8_15, phdrug_phdrug.temp_15_25, phdrug_phdrug.temp_2_8, phdrug_phdrug.temp_less_15, phdrug_phdrug.new, phdrug_phdrug.mdk_internal_code, phdrug_phdrug.mdk_single_id, phdrug_phdrug.is_from_mdk_db, phdrug_phdrug.top, phdrug_phdrug.laboratory_id, phdrug_phdrug.specialty_id
        Sort Method: external merge  Disk: 31192kB
        ->  Hash Join  (cost=704.51..6166.54 rows=29084 width=143) (actual time=23.648..507.099 rows=228667 loops=1)
              Hash Cond: (monetary_drugprice.pricelist_id = monetary_pricelist.id)
              ->  Nested Loop  (cost=696.92..5604.95 rows=44105 width=147) (actual time=22.881..416.630 rows=457692 loops=1)
                    Join Filter: (phdrug_phdrug.id = monetary_drugprice.drug_id)
                    ->  Hash Join  (cost=696.51..1177.21 rows=4583 width=147) (actual time=22.864..38.841 rows=23577 loops=1)
                          Hash Cond: (phdrug_phdrug.id = t4.drug_id)
                          ->  Seq Scan on phdrug_phdrug  (cost=0.00..359.94 rows=11992 width=143) (actual time=0.438..3.593 rows=11992 loops=1)
                                Filter: active
                                Rows Removed by Filter: 2
                          ->  Hash  (cost=639.21..639.21 rows=4584 width=4) (actual time=22.339..22.339 rows=23577 loops=1)
                                Buckets: 32768 (originally 8192)  Batches: 1 (originally 1)  Memory Usage: 1085kB
                                ->  Nested Loop  (cost=3.99..639.21 rows=4584 width=4) (actual time=1.785..16.702 rows=23577 loops=1)
                                      ->  Nested Loop  (cost=3.58..9.11 rows=5 width=8) (actual time=1.756..1.874 rows=7 loops=1)
                                            ->  Hash Left Join  (cost=3.43..7.57 rows=5 width=4) (actual time=1.733..1.797 rows=7 loops=1)
                                                  Hash Cond: (monetary_pricelistdestinations.to_organization_data_id = organization_organizationdata.id)
                                                  Filter: (monetary_pricelistdestinations.to_all_insurances OR (organization_organization.uuid = 'b51773d4-05f8-43a2-86ef-0098b31725d8'::uuid))
                                                  Rows Removed by Filter: 130
                                                  ->  Seq Scan on monetary_pricelistdestinations  (cost=0.00..3.37 rows=137 width=9) (actual time=0.626..0.643 rows=137 loops=1)
                                                  ->  Hash  (cost=3.12..3.12 rows=25 width=20) (actual time=1.076..1.076 rows=25 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                                        ->  Hash Left Join  (cost=1.56..3.12 rows=25 width=20) (actual time=1.040..1.053 rows=25 loops=1)
                                                              Hash Cond: (organization_organizationdata.organization_id = organization_organization.id)
                                                              ->  Seq Scan on organization_organizationdata  (cost=0.00..1.25 rows=25 width=8) (actual time=0.501..0.504 rows=25 loops=1)
                                                              ->  Hash  (cost=1.25..1.25 rows=25 width=20) (actual time=0.513..0.513 rows=25 loops=1)
                                                                    Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                                                    ->  Seq Scan on organization_organization  (cost=0.00..1.25 rows=25 width=20) (actual time=0.484..0.501 rows=25 loops=1)
                                            ->  Index Only Scan using monetary_pricelist_pkey on monetary_pricelist t5  (cost=0.14..0.31 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=7)
                                                  Index Cond: (id = monetary_pricelistdestinations.pricelist_id)
                                                  Heap Fetches: 7
                                      ->  Index Scan using monetary_drugprice_pricelist_id_1ce160ce on monetary_drugprice t4  (cost=0.42..110.21 rows=1581 width=8) (actual time=0.010..1.236 rows=3368 loops=7)
                                            Index Cond: (pricelist_id = t5.id)
                    ->  Index Scan using monetary_drugprice_drug_id_c2f278e5 on monetary_drugprice  (cost=0.42..0.78 rows=15 width=8) (actual time=0.002..0.009 rows=19 loops=23577)
                          Index Cond: (drug_id = t4.drug_id)
              ->  Hash  (cost=6.45..6.45 rows=91 width=4) (actual time=0.757..0.757 rows=93 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 12kB
                    ->  Seq Scan on monetary_pricelist  (cost=0.00..6.45 rows=91 width=4) (actual time=0.655..0.713 rows=93 loops=1)
                          Filter: (active AND ((NOT expires) OR (expires AND ((timezone('UTC'::text, datestart))::date <= '2019-01-22'::date) AND ((timezone('UTC'::text, dateend))::date >= '2019-01-22'::date))))
                          Rows Removed by Filter: 45
Planning time: 25.871 ms
Execution time: 3638.544 ms

如果將其粘貼到explain.depesz.comexplain.depesz.com得到此結果

查詢說明

您幾乎可以在這里看到查詢所花費的全部時間都花在了排序上:

在此處輸入圖片說明

...這就是我完全迷失的地方。 什么排序? 這不是查詢末尾的ORDER BY ,我已經嘗試刪除它。

如何改善此查詢的性能?

編輯:刪除ORDER BY之后,我要添加EXPLAIN ANALYZE 該查詢需要更長的時間才能一致地完成。 這就給問題增加了另一層wtf

Unique  (cost=10412.31..12666.32 rows=29084 width=143) (actual time=4229.508..4427.560 rows=6442 loops=1)
  ->  Sort  (cost=10412.31..10485.02 rows=29084 width=143) (actual time=4229.506..4274.698 rows=228667 loops=1)
        Sort Key: phdrug_phdrug.id, phdrug_phdrug.uuid, phdrug_phdrug.default_description, phdrug_phdrug.ean, phdrug_phdrug.parent_ean, phdrug_phdrug.reg_num, phdrug_phdrug.medika_code, phdrug_phdrug.atc_iv, phdrug_phdrug.product_type, phdrug_phdrug.fraction, phdrug_phdrug.active, phdrug_phdrug.loyal, phdrug_phdrug.patent, phdrug_phdrug.chronics, phdrug_phdrug.recipe, phdrug_phdrug.deal, phdrug_phdrug.specialized, phdrug_phdrug.armored, phdrug_phdrug.hight_speciality, phdrug_phdrug.temp_8_15, phdrug_phdrug.temp_15_25, phdrug_phdrug.temp_2_8, phdrug_phdrug.temp_less_15, phdrug_phdrug.new, phdrug_phdrug.mdk_internal_code, phdrug_phdrug.mdk_single_id, phdrug_phdrug.is_from_mdk_db, phdrug_phdrug.top, phdrug_phdrug.laboratory_id, phdrug_phdrug.specialty_id
        Sort Method: external merge  Disk: 31160kB
        ->  Hash Join  (cost=704.51..6166.54 rows=29084 width=143) (actual time=21.814..605.830 rows=228667 loops=1)
              Hash Cond: (monetary_drugprice.pricelist_id = monetary_pricelist.id)
              ->  Nested Loop  (cost=696.92..5604.95 rows=44105 width=147) (actual time=21.195..501.337 rows=457692 loops=1)
                    Join Filter: (phdrug_phdrug.id = monetary_drugprice.drug_id)
                    ->  Hash Join  (cost=696.51..1177.21 rows=4583 width=147) (actual time=21.180..43.205 rows=23577 loops=1)
                          Hash Cond: (phdrug_phdrug.id = t4.drug_id)
                          ->  Seq Scan on phdrug_phdrug  (cost=0.00..359.94 rows=11992 width=143) (actual time=0.491..5.225 rows=11992 loops=1)
                                Filter: active
                                Rows Removed by Filter: 2
                          ->  Hash  (cost=639.21..639.21 rows=4584 width=4) (actual time=20.589..20.589 rows=23577 loops=1)
                                Buckets: 32768 (originally 8192)  Batches: 1 (originally 1)  Memory Usage: 1085kB
                                ->  Nested Loop  (cost=3.99..639.21 rows=4584 width=4) (actual time=1.252..15.098 rows=23577 loops=1)
                                      ->  Nested Loop  (cost=3.58..9.11 rows=5 width=8) (actual time=1.182..1.292 rows=7 loops=1)
                                            ->  Hash Left Join  (cost=3.43..7.57 rows=5 width=4) (actual time=1.164..1.226 rows=7 loops=1)
                                                  Hash Cond: (monetary_pricelistdestinations.to_organization_data_id = organization_organizationdata.id)
                                                  Filter: (monetary_pricelistdestinations.to_all_insurances OR (organization_organization.uuid = 'b51773d4-05f8-43a2-86ef-0098b31725d8'::uuid))
                                                  Rows Removed by Filter: 130
                                                  ->  Seq Scan on monetary_pricelistdestinations  (cost=0.00..3.37 rows=137 width=9) (actual time=0.347..0.364 rows=137 loops=1)
                                                  ->  Hash  (cost=3.12..3.12 rows=25 width=20) (actual time=0.794..0.794 rows=25 loops=1)
                                                        Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                                        ->  Hash Left Join  (cost=1.56..3.12 rows=25 width=20) (actual time=0.774..0.786 rows=25 loops=1)
                                                              Hash Cond: (organization_organizationdata.organization_id = organization_organization.id)
                                                              ->  Seq Scan on organization_organizationdata  (cost=0.00..1.25 rows=25 width=8) (actual time=0.317..0.319 rows=25 loops=1)
                                                              ->  Hash  (cost=1.25..1.25 rows=25 width=20) (actual time=0.432..0.432 rows=25 loops=1)
                                                                    Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                                                    ->  Seq Scan on organization_organization  (cost=0.00..1.25 rows=25 width=20) (actual time=0.411..0.415 rows=25 loops=1)
                                            ->  Index Only Scan using monetary_pricelist_pkey on monetary_pricelist t5  (cost=0.14..0.31 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=7)
                                                  Index Cond: (id = monetary_pricelistdestinations.pricelist_id)
                                                  Heap Fetches: 7
                                      ->  Index Scan using monetary_drugprice_pricelist_id_1ce160ce on monetary_drugprice t4  (cost=0.42..110.21 rows=1581 width=8) (actual time=0.012..1.127 rows=3368 loops=7)
                                            Index Cond: (pricelist_id = t5.id)
                    ->  Index Scan using monetary_drugprice_drug_id_c2f278e5 on monetary_drugprice  (cost=0.42..0.78 rows=15 width=8) (actual time=0.002..0.012 rows=19 loops=23577)
                          Index Cond: (drug_id = t4.drug_id)
              ->  Hash  (cost=6.45..6.45 rows=91 width=4) (actual time=0.609..0.609 rows=93 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 12kB
                    ->  Seq Scan on monetary_pricelist  (cost=0.00..6.45 rows=91 width=4) (actual time=0.539..0.582 rows=93 loops=1)
                          Filter: (active AND ((NOT expires) OR (expires AND ((timezone('UTC'::text, datestart))::date <= '2019-01-22'::date) AND ((timezone('UTC'::text, dateend))::date >= '2019-01-22'::date))))
                          Rows Removed by Filter: 45
Planning time: 25.288 ms
Execution time: 4440.406 ms

EXPLAIN ANALYZE結果中進行排序的原因是,在查詢中您使用的是SELECT DISTINCT因此數據庫按行的所有列進行排序,並刪除重復項。

查看您的查詢,也許您不需要對sql表達式中的所有列進行區分。

將DISTINCT應用於所需的列

嘗試將.distinct()替換為.distinct('pk')

請參考Djangodistinct distinct()文檔

僅在PostgreSQL上,您可以傳遞位置參數(* fields)以指定DISTINCT應該應用的字段名稱。 這將轉換為SELECT DISTINCT ON SQL查詢。 這是區別。 對於正常的distinct()調用,數據庫在確定哪些行不同時會比較每行中的每個字段。 對於具有指定字段名稱的distinct()調用,數據庫將僅比較指定的字段名稱。

嘗試更改排序方式

您的查詢計划表明行排序發生在磁盤而不是內存中。

Sort Method: external merge  Disk: 31160kB

您可以嘗試其他答案的解決方案

使用LIMIT

另外,您是否真的需要一批中的所有數據。 您是否嘗試過添加分頁?

使用GROUP BY

我見過人們試圖將其查詢從DISTINCT轉換為GROUP BY但是我不確定在您的情況下是否可行。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM