Using postgres window functions with different order by clause

I have an issue using multiple order by clauses in postgres window functions. Here is a short example: select the total number of rows, the first N rows and the last N rows in a single query. (This is not the task I want to achieve, just an example of the issue.) Is this expected behavior or a bug in postgres? I'm using postgres 9.6.

select generate_series(1, 10) id
into q;

select
       count(*) over (),
       lag(id , 0) over (order by id asc) a,
       lag(id , 0) over (order by id desc) d
from q
limit 5;

Output:

10,10,10
10,9,9
10,8,8
10,7,7
10,6,6

Expected:

10,1,10
10,2,9
10,3,8
10,4,7
10,5,6

The code works fine if only the first N rows or only the last N rows are selected.

I once had a similar problem which had the same explanation: https://stackoverflow.com/a/48668220/3984221


Explanation of the behaviour:

demo:db<>fiddle

You can see what happens when you look at the EXPLAIN output:

> | QUERY PLAN                                                                                                                   |
> | :--------------------------------------------------------------------------------------------------------------------------- |
> | WindowAgg  (cost=368.69..445.19 rows=2550 width=20) (actual time=0.146..0.150 rows=10 loops=1)                               |
> |   ->  WindowAgg  (cost=368.69..413.32 rows=2550 width=12) (actual time=0.128..0.136 rows=10 loops=1)                            |
> |         ->  Sort  (cost=368.69..375.07 rows=2550 width=8) (actual time=0.126..0.128 rows=10 loops=1)                            |
> |               Sort Key: id                                                                                                   |
> |               Sort Method: quicksort  Memory: 25kB                                                                           |
> |               ->  WindowAgg  (cost=179.78..224.41 rows=2550 width=8) (actual time=0.048..0.056 rows=10 loops=1)                 |
> |                     ->  Sort  (cost=179.78..186.16 rows=2550 width=4) (actual time=0.033..0.034 rows=10 loops=1)                |
> |                           Sort Key: id DESC                                                                                  |
> |                           Sort Method: quicksort  Memory: 25kB                                                               |
> |                           ->  Seq Scan on q  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.013..0.014 rows=10 loops=1)    |
> | Planning Time: 0.292 ms                                                                                                      |
> | Execution Time: 0.445 ms                                                                                                     |

Here you can see: first there is a Sort Key: id DESC, so everything is sorted in DESC order. If you had only the DESC-ordered function, this would be your result, as you already saw. But you have a second window function, so the entire result is sorted a second time, into ASC order, including your first result. So your first lag() result 10, 9, 8, 7, 6, ... is reordered into 1, 2, 3, 4, 5, ... Afterwards the second lag() result is added.
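To reproduce a plan like the one above, the question's query can be prefixed with EXPLAIN ANALYZE. This is only a sketch: it assumes the table q from the question, it leaves out the LIMIT because the plan shown has no Limit node, and the plan shape may well differ between PostgreSQL versions.

```sql
-- Assumes table q(id) from the question, holding the values 1..10.
EXPLAIN ANALYZE
SELECT
    count(*) OVER (),
    lag(id, 0) OVER (ORDER BY id ASC)  AS a,
    lag(id, 0) OVER (ORDER BY id DESC) AS d
FROM q;
```

Since the output order simply falls out of whichever sort the planner happens to run last, a top-level ORDER BY is the only reliable way to pin down the order of the returned rows.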

However, the specific result of your lag() function is explainable, of course: you don't shift your data, so you get the current value. You can cross-check this (as I did in the fiddle above) by changing the shift value from 0 to 1. Then the DESC lag() returns 2 for id 1, while the ASC one gives NULL. Everything's fine.
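The cross-check with a shift of 1 might look like the following sketch. It assumes the table q from the question; the top-level ORDER BY and the LIMIT 3 are added here only to keep the output short and deterministic.

```sql
-- Assumes table q(id) from the question, holding the values 1..10.
SELECT
    id,
    lag(id, 1) OVER (ORDER BY id ASC)  AS a,  -- previous row in ascending order
    lag(id, 1) OVER (ORDER BY id DESC) AS d   -- previous row in descending order
FROM q
ORDER BY id
LIMIT 3;
-- id | a    | d
--  1 | NULL | 2
--  2 | 1    | 3
--  3 | 2    | 4
```

Each window function is evaluated against its own ordering, regardless of the order in which the rows are finally returned.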

So, to create your expected output, you need another approach, e.g. using row_number() to add the row number in ASC and in DESC order and joining on it afterwards:

demo:db<>fiddle

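SELECT
    COUNT(*) OVER (),
    a.id,
    d.id
FROM ( 
   -- number the rows in ascending id order
   select
       id,
       row_number() over (order by id asc)
   from q
) a
JOIN ( 
   -- number the rows in descending id order
   select
       id,
       row_number() over (order by id desc)
   from q
) d ON a.row_number = d.row_number
-- without an explicit ORDER BY, LIMIT may return any 5 rows
ORDER BY a.row_number
LIMIT 5;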

The second parameter of lag() determines how many rows to look back. So lag(id, 0) means looking back zero rows, which makes lag(id, 0) equivalent to just id. The result you get is therefore perfectly sane.

To get what you want, you can use row_number() to join on.

SELECT count(*) OVER (),
       x1.id,
       x2.id
       FROM (SELECT id,
                    row_number() OVER (ORDER BY id ASC) r
                    FROM q) x1
            INNER JOIN (SELECT id,
                               row_number() OVER (ORDER BY id DESC) r
                               FROM q) x2
                       ON x2.r = x1.r
      ORDER BY x1.r
      LIMIT 5;

The order from only one of the over clauses is applied to the data, because the data is only sorted once. The desired behavior can be achieved with the following query.

select count(*) over (),
       (array_agg(id) over (order by id asc rows between unbounded preceding and unbounded  following))[row_number() over ()] a,
       (array_agg(id) over (order by id desc rows between unbounded preceding and unbounded  following))[row_number() over ()] d
from q
order by id
limit 5

This may be memory-inefficient for large tables, because array_agg constructs an array from all rows.
