
PostgreSQL query optimization challenge

I am trying to optimize this query:

SELECT eq.*,
    reg_last_dt.dt as reg_last_date
FROM Equipment eq
  INNER JOIN (
    select max( dt ) as dt, id_eq_equipment 
    from consum 
    group by id_eq_equipment
  ) as reg_last_dt ON reg_last_dt.id_eq_equipment = eq.id_eq

EXPLAIN shows me this:

Hash Join  (cost=839806.69..839833.33 rows=23 width=1461)
  Hash Cond: (eq.id_eq = consum.id_eq_equipment)
  ->  Seq Scan on equipment eq  (cost=0.00..26.29 rows=129 width=1453)
  ->  Hash  (cost=839806.40..839806.40 rows=23 width=10)
        ->  Finalize GroupAggregate  (cost=839805.60..839806.17 rows=23 width=10)
              Group Key: consum.id_eq_equipment
              ->  Sort  (cost=839805.60..839805.71 rows=46 width=10)
                    Sort Key: consum.id_eq_equipment
                    ->  Gather  (cost=839799.50..839804.33 rows=46 width=10)
                          Workers Planned: 2
                          ->  Partial HashAggregate  (cost=838799.50..838799.73 rows=23 width=10)
                                Group Key: consum.id_eq_equipment
                                ->  Parallel Seq Scan on consum  (cost=0.00..755192.33 rows=16721433 width=10)

This does not look optimal. Is there anything I could do to make it better?

The row estimates in the query plan (only rows=129 for Equipment, and only rows=23 for the aggregated consum) indicate that a query using a LATERAL subquery instead should perform much faster. Rather than scanning all of consum, it needs only one index lookup per row of Equipment:

SELECT eq.*, r.reg_last_date
FROM   Equipment eq
CROSS  JOIN LATERAL (
   SELECT max(dt) AS reg_last_date
   FROM   consum c
   WHERE  c.id_eq_equipment = eq.id_eq
   ) r;

Be sure to have a multicolumn index on consum(id_eq_equipment, dt)!
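A minimal sketch of such an index follows. The table and column names come from the query above; the index name is hypothetical, so pick whatever fits your naming scheme.

```sql
-- Hypothetical index name; table and columns are taken from the question.
-- Equality column first, then the column that max() is computed over.
CREATE INDEX consum_id_eq_equipment_dt_idx
    ON consum (id_eq_equipment, dt);
```

On a busy table, CREATE INDEX CONCURRENTLY avoids blocking writes while the index is built, at the cost of a slower build.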

Maybe you really want a LEFT JOIN to return all rows from Equipment?

If the estimates in the plan are correct, it would almost certainly be faster to do it with a subselect, this way:

SELECT 
    eq.*, 
    (select max( dt ) from consum where consum.id_eq_equipment = eq.id_eq) as reg_last_date
FROM Equipment eq

Note that this returns NULL for reg_last_date where there is no corresponding record in consum, so you might want to filter those rows out if you don't want to see them.
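One way to filter those rows out is to wrap the query in a derived table and test the computed column. This is only a sketch; the alias sub is an arbitrary name chosen here.

```sql
-- Wrap the subselect query so the computed column can be filtered on.
SELECT *
FROM (
    SELECT eq.*,
           (SELECT max(dt)
            FROM consum
            WHERE consum.id_eq_equipment = eq.id_eq) AS reg_last_date
    FROM Equipment eq
) sub  -- "sub" is an arbitrary alias
WHERE sub.reg_last_date IS NOT NULL;
```

This also drops equipment whose consum rows all have a NULL dt. The original INNER JOIN would have kept those rows with a NULL date, so the two queries match exactly only if dt is never NULL.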

You would need an index on (id_eq_equipment, dt) to make it fast.
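To check whether the planner actually uses such an index, you could run the rewritten query under EXPLAIN. A sketch, assuming the index already exists:

```sql
-- ANALYZE executes the query and reports real row counts and timings;
-- BUFFERS adds how many pages were read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT eq.*,
       (SELECT max(dt)
        FROM consum
        WHERE consum.id_eq_equipment = eq.id_eq) AS reg_last_date
FROM Equipment eq;
```

If the index is picked up, the plan typically shows a backward index scan with a Limit for each row of Equipment, because PostgreSQL can answer max(dt) by reading a single entry from the end of the matching index range. A remaining Seq Scan on consum would suggest the index is missing or not usable.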
