PostgreSQL 9.6 chooses a wrong plan when aggregating on a timestamp column

Question

I have a simple but fairly large table "log" with three columns: user_id, day, hours.

user_id character varying(36) COLLATE pg_catalog."default" NOT NULL,
day timestamp without time zone,
hours double precision

All columns are indexed.

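The problem is that aggregation on the "day" column is extremely slow. For example, this simple query takes forever to complete (about 146 seconds in the plan below):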

select min(day) from log where user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'

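EXPLAIN ANALYZE shows that Postgres scans the index on day (index_log_day) and filters out every entry that does not belong to user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf' (12665610 rows removed by the filter), which is completely counterintuitive: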

[
  {
    "Execution Time": 146502.05,
    "Planning Time": 0.893,
    "Plan": {
      "Startup Cost": 789.02,
      "Actual Rows": 1,
      "Plans": [
        {
          "Startup Cost": 0.44,
          "Actual Rows": 1,
          "Plans": [
            {
              "Index Cond": "(log.day IS NOT NULL)",
              "Startup Cost": 0.44,
              "Scan Direction": "Forward",
              "Plan Width": 8,
              "Rows Removed by Index Recheck": 0,
              "Actual Rows": 1,
              "Node Type": "Index Scan",
              "Total Cost": 1395792.54,
              "Plan Rows": 1770,
              "Relation Name": "log",
              "Alias": "log",
              "Parallel Aware": false,
              "Actual Total Time": 146502.015,
              "Output": [
                "log.day"
              ],
              "Parent Relationship": "Outer",
              "Actual Startup Time": 146502.015,
              "Schema": "public",
              "Filter": "((log.user_id)::text = 'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",
              "Actual Loops": 1,
              "Rows Removed by Filter": 12665610,
              "Index Name": "index_log_day"
            }
          ],
          "Node Type": "Limit",
          "Plan Rows": 1,
          "Parallel Aware": false,
          "Actual Total Time": 146502.016,
          "Output": [
            "log.day"
          ],
          "Parent Relationship": "InitPlan",
          "Actual Startup Time": 146502.016,
          "Plan Width": 8,
          "Subplan Name": "InitPlan 1 (returns $0)",
          "Actual Loops": 1,
          "Total Cost": 789.02
        }
      ],
      "Node Type": "Result",
      "Plan Rows": 1,
      "Parallel Aware": false,
      "Actual Total Time": 146502.019,
      "Output": [
        "$0"
      ],
      "Actual Startup Time": 146502.019,
      "Plan Width": 8,
      "Actual Loops": 1,
      "Total Cost": 789.03
    },
    "Triggers": []
  }
]

Even stranger, an almost identical query works perfectly:

select min(hours) from log where user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'

Here Postgres first selects the entries for user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf' and then aggregates over them, which is obviously the right approach:

[
  {
    "Execution Time": 5.989,
    "Planning Time": 1.186,
    "Plan": {
      "Partial Mode": "Simple",
      "Startup Cost": 6842.66,
      "Actual Rows": 1,
      "Plans": [
        {
          "Startup Cost": 66.28,
          "Plan Width": 8,
          "Rows Removed by Index Recheck": 0,
          "Actual Rows": 745,
          "Plans": [
            {
              "Startup Cost": 0,
              "Plan Width": 0,
              "Actual Rows": 745,
              "Node Type": "Bitmap Index Scan",
              "Index Cond": "((log.user_id)::text = 'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",
              "Plan Rows": 1770,
              "Parallel Aware": false,
              "Actual Total Time": 0.25,
              "Parent Relationship": "Outer",
              "Actual Startup Time": 0.25,
              "Total Cost": 65.84,
              "Actual Loops": 1,
              "Index Name": "index_log_user_id"
            }
          ],
          "Recheck Cond": "((log.user_id)::text = 'ab056f5a-390b-41d7-ba56-897c14b679bf'::text)",
          "Exact Heap Blocks": 742,
          "Node Type": "Bitmap Heap Scan",
          "Plan Rows": 1770,
          "Relation Name": "log",
          "Alias": "log",
          "Parallel Aware": false,
          "Actual Total Time": 5.793,
          "Output": [
            "day",
            "hours",
            "user_id"
          ],
          "Lossy Heap Blocks": 0,
          "Parent Relationship": "Outer",
          "Actual Startup Time": 0.357,
          "Total Cost": 6838.23,
          "Actual Loops": 1,
          "Schema": "public"
        }
      ],
      "Node Type": "Aggregate",
      "Strategy": "Plain",
      "Plan Rows": 1,
      "Parallel Aware": false,
      "Actual Total Time": 5.946,
      "Output": [
        "min(hours)"
      ],
      "Actual Startup Time": 5.946,
      "Plan Width": 8,
      "Actual Loops": 1,
      "Total Cost": 6842.67
    },
    "Triggers": []
  }
]

There are two possible workarounds:

1) Rewrite the query as:

select user_id, min(day) from log where user_id = 'ac43a155-4fbb-49eb-a670-02c307eb3d4f' group by user_id

2) Introduce a pair (two-column) index, as suggested in the question "Finding MAX(db_timestamp) query".
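
A minimal sketch of what the second workaround could look like. The index name index_log_user_id_day is hypothetical, and the column order (user_id first, then day) is an assumption, chosen so that one index can serve both the equality filter and the minimum:

```sql
-- Hypothetical two-column index: user_id first (equality filter),
-- day second (entries are ordered by day within each user_id).
CREATE INDEX index_log_user_id_day ON log (user_id, day);

-- Re-check the plan of the original query once the index exists.
EXPLAIN (ANALYZE)
SELECT min(day) FROM log WHERE user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf';
```

On a large table that is being written to, CREATE INDEX CONCURRENTLY avoids blocking writes during the build, at the cost of a slower build.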

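Both look fine, but I consider them workarounds (the first one is even a hack). Logically, if Postgres can choose a proper plan for "hours", it should be able to do so for "day", yet it does not. So this looks like a Postgres bug that occurs when aggregating on timestamp columns, although I admit I may be missing something. Can someone tell me whether anything can be done here without a workaround (WA), or whether this really is a Postgres bug that I should report?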

UPD: I have reported this to the PostgreSQL bugs mailing list. I will let everyone know whether it is accepted.

Answer 1

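Min is an aggregate function, not an operator. A function has to be executed on all the matched records. The fields in the select part do not affect the plan. FROM ... JOIN ..., GROUP BY ..., ORDER BY ...: all of these are considered in planning. Try: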

select day from log where user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
order by user_id, day
limit 1
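
One way to check whether this rewrite actually gets a different plan is to run it under EXPLAIN. The option list below is an assumption, picked to produce output in the same shape as the plans in the question:

```sql
-- ANALYZE executes the query and reports actual timings,
-- VERBOSE adds the Output and Schema fields,
-- FORMAT JSON matches the plan format shown in the question.
EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON)
SELECT day
FROM log
WHERE user_id = 'ab056f5a-390b-41d7-ba56-897c14b679bf'
ORDER BY user_id, day
LIMIT 1;
```

Note one difference from min(day): when the user has no rows at all, this form returns zero rows rather than a single NULL.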

Answer 2

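I received a reply from PostgreSQL on the bug report. They do not consider it a bug. Workarounds exist for this case; many of them were mentioned in the original post and in the later comments. My personal choice is the first option from the original post, because it does not require changing indexes (which is not always possible). So the solution is to rewrite the query as: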

select user_id, min(day) from log where user_id = 'ac43a155-4fbb-49eb-a670-02c307eb3d4f' group by user_id
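
For context, the estimates in the question's two plans are consistent with the planner pricing a Limit node as a proportional share of its child's cost. The sketch below reproduces that arithmetic using only numbers copied from the plans above. Reading it as the planner assuming that the user's rows are spread evenly through index_log_day is an interpretation, not something stated in the reply:

```sql
-- Inputs copied from the first plan: index scan startup cost 0.44,
-- index scan total cost 1395792.54, estimated matching rows 1770.
-- The Limit node needs only 1 of those 1770 rows.
SELECT round(0.44 + (1395792.54 - 0.44) * 1 / 1770, 2) AS limit_cost_estimate;
-- Compare with "Total Cost": 789.02 on the Limit node of the min(day) plan,
-- and with "Total Cost": 6842.67 for the bitmap-scan plan chosen for min(hours).
```

In the actual run, 12665610 index entries were filtered out before the first match was found, so the real cost was far above this estimate.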

Answer 3

See this post, which has some discussion of column order in indexes: PostgreSQL index not used for query on range

https://dba.stackexchange.com/questions/39589/optimizing-queries-on-a-range-of-timestamps-two-columns

Another idea is:

select min(day) from (
   select day from log 
      where user_id = 'ac43a155-4fbb-49eb-a670-02c307eb3d4f'
) q

P.S. Also, can you confirm that vacuum (verbose, analyze) has been run on this table?
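
A minimal sketch of how that could be checked, assuming the table is visible through the standard pg_stat_user_tables statistics view (the plans in the question show schema public):

```sql
-- When were the table's statistics last refreshed, manually or by autovacuum?
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'log';

-- Refresh them by hand if the timestamps look stale or are NULL.
VACUUM (VERBOSE, ANALYZE) log;
```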
