
Postgres: aggregation function that returns a column

In Postgres, when I call a function on some data, like so:

select f(col_nums) from tbl_name
where col_str = '12345'

then the function f will be applied to each row where col_str = '12345'.

On the other hand, if I call an aggregation function on some data, like so:

select g_agg(col_nums) from tbl_name
where col_str = '12345'

then the function g_agg will be called on the entire column but will return a single value.

Q: How can I make a function that is applied to the entire column and returns a column of the same size, while being aware of all the values in the subset?

For example, can I create a function to calculate a cumulative sum?

select *, sum_accum(col_nums) as cs from tbl_name
where col_str = '12345'

such that the result of the above query would look like this:

 col_str | more_cols | col_nums | cs
---------+-----------+----------+----
  12345  |    567    |    1     |  1
  12345  |    568    |    2     |  3
  12345  |    569    |    3     |  6
  12345  |    570    |    4     | 10

Is there no choice but to pass a sub-query result to a function and then join with the original table?

Use window functions.

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row; the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.

e.g.

select *, sum(col_nums) OVER(PARTITION BY T.COLX, T.COLY) as cs 
from tbl_name T
where col_str = '12345'

Note that it is the addition of an OVER clause that changes an aggregate from its traditional use into a window function:

the OVER clause causes it to be treated as a window function and computed across an appropriate set of rows
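To make the difference concrete, here is a minimal sketch that runs the same sum with and without an OVER clause. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for Postgres (an assumption; it needs SQLite 3.25 or newer for window functions), and the sample rows are taken from the table in the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (col_str TEXT, more_cols INTEGER, col_nums INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?, ?)",
    [("12345", 567, 1), ("12345", 568, 2), ("12345", 569, 3), ("12345", 570, 4)],
)

# Plain aggregate: the four matching rows collapse into one output row.
aggregated = conn.execute(
    "SELECT sum(col_nums) FROM tbl_name WHERE col_str = '12345'"
).fetchall()

# Same aggregate with an empty OVER (): every row is kept, and each row
# carries the total computed over all rows that passed the WHERE filter.
windowed = conn.execute(
    "SELECT col_nums, sum(col_nums) OVER () AS total "
    "FROM tbl_name WHERE col_str = '12345' ORDER BY more_cols"
).fetchall()

print(aggregated)  # a single row holding the total
print(windowed)    # four rows, each paired with the same total
```

The only textual difference between the two queries is the OVER clause, which is what switches sum from collapsing rows to annotating them.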

The OVER clause can contain a PARTITION BY (analogous to GROUP BY), which controls the window that the calculations are performed in. It also allows an ORDER BY. In PostgreSQL any aggregate used as a window function accepts one, count included, and with the default frame it turns the aggregate into a running calculation over the rows up to and including the current row (plus any rows that tie with it on the ordering columns).

select *
   -- running sum: the ORDER BY limits each row's sum to the rows up to it
 , sum(col_nums) OVER(PARTITION BY T.COLX ORDER BY T.COLY) as cs

   -- no ORDER BY: every row gets the count of its whole partition
 , count(*) OVER(PARTITION BY T.COLX) as cs_count
from tbl_name T
where col_str = '12345'
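The effect of ORDER BY inside OVER can be checked with a small sketch. As an assumption, it uses an in-memory SQLite database through Python's sqlite3 module in place of Postgres (SQLite 3.25 or newer), partitions by col_str and orders by more_cols instead of the placeholder COLX and COLY columns, and reuses the sample rows from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (col_str TEXT, more_cols INTEGER, col_nums INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?, ?)",
    [("12345", 567, 1), ("12345", 568, 2), ("12345", 569, 3), ("12345", 570, 4)],
)

rows = conn.execute(
    """
    SELECT col_nums,
           -- ORDER BY in the window: running sum up to the current row
           sum(col_nums) OVER (PARTITION BY col_str ORDER BY more_cols) AS running_sum,
           -- ORDER BY in the window: running count up to the current row
           count(*) OVER (PARTITION BY col_str ORDER BY more_cols) AS running_count,
           -- no ORDER BY: count of the whole partition on every row
           count(*) OVER (PARTITION BY col_str) AS partition_count
    FROM tbl_name
    WHERE col_str = '12345'
    ORDER BY more_cols
    """
).fetchall()

for row in rows:
    print(row)
```

Each output tuple is (col_nums, running_sum, running_count, partition_count). The two count columns differ only in whether the window has an ORDER BY, which is what decides between a running value and a per-partition value.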

The function that you want is a cumulative sum. This is handled by window functions:

select t.*, sum(col_nums) over (order by more_cols) as cs
from tbl_name t
where col_str = '12345';

I am guessing that the ordering is defined by the second column (more_cols). It can be any column, including col_nums.
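As a quick check, the sketch below runs this query against the four rows from the question and then repeats it with a descending order to show that the choice of ordering changes the running sum. It assumes an in-memory SQLite database (3.25 or newer) driven from Python's sqlite3 module as a substitute for Postgres. The outer ORDER BY is added only to make the row order of the output deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (col_str TEXT, more_cols INTEGER, col_nums INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?, ?)",
    [("12345", 567, 1), ("12345", 568, 2), ("12345", 569, 3), ("12345", 570, 4)],
)

# Cumulative sum ordered by more_cols, as in the answer's query.
ascending = conn.execute(
    "SELECT t.*, sum(col_nums) OVER (ORDER BY more_cols) AS cs "
    "FROM tbl_name t WHERE col_str = '12345' ORDER BY more_cols"
).fetchall()

# Same sum accumulated in the opposite direction: the window ordering,
# not the output ordering, decides which rows each sum includes.
descending = conn.execute(
    "SELECT col_nums, sum(col_nums) OVER (ORDER BY more_cols DESC) AS cs "
    "FROM tbl_name t WHERE col_str = '12345' ORDER BY more_cols"
).fetchall()

for row in ascending:
    print(row)  # the cs column reads 1, 3, 6, 10
for row in descending:
    print(row)  # the cs column reads 10, 9, 7, 4
```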

You can do this for all values of col_str at the same time, using the partition by clause:

select t.*, sum(col_nums) over (partition by col_str order by more_cols) as cs
from tbl_name t
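A small sketch of the partitioned version follows, again assuming an in-memory SQLite database (3.25 or newer) via Python's sqlite3 module rather than Postgres. The second group, with col_str '67890', is hypothetical data added only to show that the running sum restarts for each partition.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (col_str TEXT, more_cols INTEGER, col_nums INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?, ?)",
    [
        ("12345", 567, 1),
        ("12345", 568, 2),
        # Hypothetical second group, not part of the original question.
        ("67890", 100, 10),
        ("67890", 101, 20),
        ("67890", 102, 30),
    ],
)

# No WHERE clause: PARTITION BY gives each col_str its own running sum.
rows = conn.execute(
    """
    SELECT t.*,
           sum(col_nums) OVER (PARTITION BY col_str ORDER BY more_cols) AS cs
    FROM tbl_name t
    ORDER BY col_str, more_cols
    """
).fetchall()

for row in rows:
    print(row)
```

The cs value starts over at the first row of each col_str group, so no filtering or joining back to the original table is needed.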


 