Unstable query behavior with a window function in PostgreSQL 9.6

Question

I downloaded OSM data for my country from geofabrik.de, imported it into PostgreSQL 9.6 on Ubuntu 16.04, and have used it several times. I also built a web application on top of it, which works correctly. I then decided to add a feature that returns the nearest special points (e.g. restaurants) to a given point. It works for a single nearest point, but not when I try to return an array of them. So I broke the problem down and found some strange behavior. When I execute the following query:

SELECT t.osm_id
      FROM (
        SELECT DISTINCT ON (a.points) a.points, v.osm_id AS osm_id, MIN(ST_DISTANCE(v.the_geom, a.points)) OVER (PARTITION BY a.points ORDER BY ST_DISTANCE(v.the_geom, a.points))
        FROM (SELECT ST_GEOMFROMEWKT('SRID=4326;POINT(17.104854583740238 48.15099866770469)') AS points) a
        CROSS JOIN ways_vertices_pgr v
      ) AS t

it returns:

| osm_id            |
| ----------------- |
| 2338524511        |

When I display this point on a map, it lies far from the original point, and when I change the point in the subquery the result stays the same. I also know there are many points between the displayed point and the original one that the query should return instead. Then I tried running the following query:

SELECT t.*, t.osm_id
      FROM (
        SELECT DISTINCT ON (a.points) a.points, v.osm_id AS osm_id, MIN(ST_DISTANCE(v.the_geom, a.points)) OVER (PARTITION BY a.points ORDER BY ST_DISTANCE(v.the_geom, a.points))
        FROM (SELECT ST_GEOMFROMEWKT('SRID=4326;POINT(17.104854583740238 48.15099866770469)') AS points) a
        CROSS JOIN ways_vertices_pgr v
      ) AS t

and it returns:

| points                                             | osm_id   | min                  | osm_id     |
| -------------------------------------------------- | -------- | -------------------- | --------   |
| 0101000020E6100000010000C0D71A3140FFC3A1EC53134840 | 33169309 | 0.000124886435658481 | 33169309   |

The whole query except the SELECT list is the same, but the result is different, and now it is correct. Can anyone suggest how to change the query so that it works properly?

Answer 1

When you use DISTINCT ON, you need an ORDER BY. I think this is the logic you want for the first query:

    SELECT DISTINCT ON (a.points) a.points, v.osm_id AS osm_id,ST_DISTANCE(v.the_geom, a.points) as dist
    FROM (SELECT ST_GEOMFROMEWKT('SRID=4326;POINT(17.104854583740238 48.15099866770469)') AS points) a CROSS JOIN
         ways_vertices_pgr v
    ORDER BY a.points, dist;
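
If the goal is several nearest vertices per point rather than one, DISTINCT ON will not do, since it keeps exactly one row per group. One possible sketch, which goes beyond the query above, ranks the candidates with row_number() and filters on the rank. The cutoff of 5 and the aliases dist and rn are illustrative assumptions; osm_id is added to the window ordering only as a tie-breaker so the ranking is repeatable.

```sql
-- Sketch: the 5 nearest vertices for each input point (5 is an arbitrary choice).
SELECT t.points, t.osm_id, t.dist
FROM (
    SELECT a.points,
           v.osm_id,
           ST_Distance(v.the_geom, a.points) AS dist,
           -- Rank candidates per input point, nearest first; osm_id breaks ties.
           row_number() OVER (
               PARTITION BY a.points
               ORDER BY ST_Distance(v.the_geom, a.points), v.osm_id
           ) AS rn
    FROM (SELECT ST_GeomFromEWKT('SRID=4326;POINT(17.104854583740238 48.15099866770469)') AS points) a
    CROSS JOIN ways_vertices_pgr v
) AS t
WHERE t.rn <= 5
ORDER BY t.points, t.rn;
```

With only one input point, a plain ORDER BY dist LIMIT 5 on the cross join would probably be simpler; the window form matters once the subquery a returns more than one point.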

Answer 2

Check the output of EXPLAIN ANALYZE for each query to see exactly why the results change when you add the columns. Most likely the planner picks a slightly different execution plan, which affects the order of the rows.
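
As a rough sketch of that check, the first query from the question can be prefixed with EXPLAIN ANALYZE, and the same can be done for the second variant with t.* in the SELECT list. Note that EXPLAIN ANALYZE actually executes the query. What to look for is a guess on my part: it may be worth comparing whether nodes such as Sort and WindowAgg appear in both plans or only in one.

```sql
-- Plan for the variant that returned the unexpected osm_id.
EXPLAIN ANALYZE
SELECT t.osm_id
FROM (
    SELECT DISTINCT ON (a.points) a.points, v.osm_id AS osm_id,
           MIN(ST_DISTANCE(v.the_geom, a.points))
               OVER (PARTITION BY a.points ORDER BY ST_DISTANCE(v.the_geom, a.points))
    FROM (SELECT ST_GEOMFROMEWKT('SRID=4326;POINT(17.104854583740238 48.15099866770469)') AS points) a
    CROSS JOIN ways_vertices_pgr v
) AS t;
```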

Without a suitable ORDER BY, DISTINCT ON is non-deterministic: which row it keeps from each group can change between executions. From the PostgreSQL 9.6 manual:

SELECT DISTINCT ON ... Note that the "first row" of a set is unpredictable unless the query is sorted on enough columns to guarantee a unique ordering of the rows arriving at the DISTINCT filter. (DISTINCT ON processing occurs after ORDER BY sorting.)

Adding an ORDER BY as Gordon suggested in Answer 1 should give you repeatable results.
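
To see the effect without any OSM data, here is a small self-contained sketch. The candidates table, its column names, and its values are made up for illustration; only the DISTINCT ON plus ORDER BY pattern comes from the discussion above.

```sql
-- Hypothetical toy data: three candidate vertices for one query point.
WITH candidates(point_id, osm_id, dist) AS (
    VALUES (1, 101, 0.50),
           (1, 102, 0.10),
           (1, 103, 0.30)
)
SELECT DISTINCT ON (point_id) point_id, osm_id, dist
FROM candidates
-- The leading ORDER BY column matches the DISTINCT ON expression;
-- dist picks the nearest row, and osm_id makes the ordering unique.
ORDER BY point_id, dist, osm_id;
-- Returns the single row (1, 102, 0.10).
```

If the ORDER BY is dropped, any of the three rows may come back, which is the kind of instability described in the question.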
