How to reduce a query's execution time

Question

I have a sum query that takes a long time to return its result, about 40 seconds. Here's my query:

SELECT
    id_article, 
    sum(qte) as total 
FROM 
    Mouvstk
WHERE 
    date >= '20180609'
GROUP BY 
    id_article

I've created an index on id_article and another one on date. There are about 16 million rows.

When I run EXPLAIN ANALYZE VERBOSE, I get this result:

Finalize GroupAggregate  (cost=440073.16..443607.99 rows=6779 width=40) (actual time=25504.816..25562.865 rows=14142 loops=1)
  Output: id_article, sum(qte)
  Group Key: mouvstk.id_article
  ->  Gather Merge  (cost=440073.16..443319.89 rows=27116 width=40) (actual time=25504.799..25580.712 rows=63081 loops=1)
        Output: id_article, (PARTIAL sum(qte))
        Workers Planned: 4
        Workers Launched: 4
        ->  Sort  (cost=439073.10..439090.05 rows=6779 width=40) (actual time=25446.155..25447.759 rows=12616 loops=5)
              Output: id_article, (PARTIAL sum(qte))
              Sort Key: mouvstk.id_article
              Sort Method: quicksort  Memory: 1434kB
              Worker 0:  Sort Method: quicksort  Memory: 1431kB
              Worker 1:  Sort Method: quicksort  Memory: 1428kB
              Worker 2:  Sort Method: quicksort  Memory: 1430kB
              Worker 3:  Sort Method: quicksort  Memory: 1430kB
              Worker 0: actual time=25433.322..25434.870 rows=12618 loops=1
              Worker 1: actual time=25435.450..25437.032 rows=12599 loops=1
              Worker 2: actual time=25427.157..25428.702 rows=12611 loops=1
              Worker 3: actual time=25432.809..25434.284 rows=12599 loops=1
              ->  Partial HashAggregate  (cost=438556.99..438641.73 rows=6779 width=40) (actual time=25432.515..25441.923 rows=12616 loops=5)
                    Output: id_article, PARTIAL sum(qte)
                    Group Key: mouvstk.id_article
                    Worker 0: actual time=25417.656..25428.424 rows=12618 loops=1
                    Worker 1: actual time=25424.587..25432.008 rows=12599 loops=1
                    Worker 2: actual time=25416.391..25423.729 rows=12611 loops=1
                    Worker 3: actual time=25417.598..25428.208 rows=12599 loops=1
                    ->  Parallel Seq Scan on public.mouvstk  (cost=0.00..429549.32 rows=1801535 width=13) (actual time=454.411..24611.221 rows=1439376 loops=5)
                          Output: code_origine, numero_caisse, numero_document, date, code_clifour, code_vendeur, code_affaire, code_magasin, numero_serie, libelle, puht, puhtnet, puttc, puttcnet, taux_remise, code_tva, taux_tva, code_devise, parite_devise, frais_approche, prht, nomenclature, type_vente, code_tarif, code_categorie_achat, numero_lot, date_peremption, pvttcstd, lib_tarif, id_ligne_document, id, id_article, qte, id_clifour
                          Filter: (mouvstk.date >= '2018-06-09'::date)
                          Rows Removed by Filter: 1791877
                          Worker 0: actual time=438.619..24600.391 rows=1428362 loops=1
                          Worker 1: actual time=445.653..24609.448 rows=1425821 loops=1
                          Worker 2: actual time=437.424..24600.521 rows=1430897 loops=1
                          Worker 3: actual time=438.652..24605.422 rows=1430127 loops=1
Planning Time: 0.356 ms
Execution Time: 25624.787 ms

Can someone explain why the query takes so long and help me reduce the execution time?

Thanks.

Answer 1

Try the following compound index:

CREATE INDEX idx ON Mouvstk (date, id_article, qte);

Putting date first lets Postgres skip the index entries earlier than June 9th, 2018. For what remains of the B-tree, Postgres can compute the per-article sums by scanning the index once. Note that I also include qte at the end of the index, so that Postgres doesn't need to go back to the main table to fetch this value. Such an index is said to cover your query completely.

Make sure you VACUUM the table to get efficient index-only scans; they rely on the visibility map, which VACUUM keeps up to date.
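
To check whether the index is actually being used, something along these lines should work. This is only a sketch: it assumes the index was created under the name idx as above, and the table and column names are taken from the question.

```sql
-- Refresh the visibility map and the planner statistics for the table.
VACUUM (ANALYZE) mouvstk;

-- Re-run the original query with timing and buffer usage.
-- Ideally the plan shows "Index Only Scan using idx on mouvstk"
-- with a low "Heap Fetches" count instead of "Parallel Seq Scan".
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    id_article,
    sum(qte) AS total
FROM
    mouvstk
WHERE
    date >= '20180609'
GROUP BY
    id_article;
```

If the plan still shows a sequential scan, the planner may simply estimate that the date filter keeps too large a share of the table for the index to pay off; in the plan from the question, each worker keeps roughly 1.4 million rows and removes about 1.8 million.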
