Aggregation in Postgres for finding last value

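I have a table in Postgres that holds aggregated data, with the following fields:

search_term --> a specific search term
date --> the date the search was performed
search_count --> how many searches were performed with this search term
min_result_count --> the minimum number of results returned for the search term
max_results_count --> the maximum number of results returned for the search term
last_result_count --> the number of results returned by the most recent search
zero_result_count --> how many times this search term returned no results

The combination of date and search_term is unique: a search term is never repeated for a given date; its values are updated instead.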

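I am trying to write a SQL query over a 7-day period that returns the following for each search term:
search_term
min_result_count
max_result_count
zero_result_count
last_result_count

I can get most of these values with the MIN, MAX and SUM aggregates, but I cannot work out last_result_count, because that requires taking only the last value.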

Here is a sample of the table, followed by the expected results:

search_term    search_count    min_rc    max_rc    zero_count    last_rc    date
---------------------------------------------------------------------------------------
term1          10              10        20        0              4        01-01-2020
term1          10              11        21        0              5        02-01-2020
term1          10              12        22        0              6        03-01-2020
term1          10              13        23        0              7        04-01-2020
term1          10              14        24        0              8        05-01-2020

term2          10              24        25        0              9        01-01-2020
term2          10              23        26        0              10       02-01-2020
term2          10              22        27        0              11       03-01-2020
term2          10              21        28        0              12       04-01-2020
term2          10              0         29        3              0        04-01-2020
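
To experiment with queries against this sample, the rows can be loaded into a small table. This is only a minimal setup sketch under several assumptions: the table name t and the short column names (min_rc, max_rc, zero_count, last_rc) are taken from the sample header, the column types are guesses, the dates are read as DD-MM-YYYY, and the final term2 row is assumed to be dated 05-01-2020 rather than 04-01-2020, because (date, search_term) is stated to be unique and the expected results only work out that way.

```sql
-- Hypothetical schema; types and the table name t are assumptions.
CREATE TABLE t (
    search_term  text    NOT NULL,
    search_count integer NOT NULL,
    min_rc       integer NOT NULL,
    max_rc       integer NOT NULL,
    zero_count   integer NOT NULL,
    last_rc      integer NOT NULL,
    date         date    NOT NULL,
    UNIQUE (date, search_term)   -- one row per term per day
);

-- Dates written in ISO form (YYYY-MM-DD) to avoid DateStyle ambiguity.
INSERT INTO t (search_term, search_count, min_rc, max_rc, zero_count, last_rc, date) VALUES
    ('term1', 10, 10, 20, 0,  4, '2020-01-01'),
    ('term1', 10, 11, 21, 0,  5, '2020-01-02'),
    ('term1', 10, 12, 22, 0,  6, '2020-01-03'),
    ('term1', 10, 13, 23, 0,  7, '2020-01-04'),
    ('term1', 10, 14, 24, 0,  8, '2020-01-05'),
    ('term2', 10, 24, 25, 0,  9, '2020-01-01'),
    ('term2', 10, 23, 26, 0, 10, '2020-01-02'),
    ('term2', 10, 22, 27, 0, 11, '2020-01-03'),
    ('term2', 10, 21, 28, 0, 12, '2020-01-04'),
    -- assumed to be 2020-01-05; the sample lists 04-01-2020 for this row
    ('term2', 10,  0, 29, 3,  0, '2020-01-05');
```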

If I run the query for 05-01-2020, I should get

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          50              10        24        0              8      
term2          50              0         29        3              0     

If I run the query for 04-01-2020, I should get

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          40              10        23        0              7      
term2          40              21        28        0              12     

If I run the query for 03-01-2020, I should get

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          30              10        23        0              6      
term2          30              22        27        0              11     
  • rc stands for result_count

And so on. Any help with deriving last_result_count would be much appreciated.

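You can use the ROW_NUMBER window function for this. ROW_NUMBER sorts your data by the column you give it in ORDER BY and then assigns each row a sequential number.

ROW_NUMBER() OVER (PARTITION BY date, search_term ORDER BY LAST_RC) AS ROW_NUMBERED_COLUMN

You can then group the data and use MAX(ROW_NUMBERED_COLUMN).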
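
One way the ROW_NUMBER idea might be turned into a complete query is to number the rows per search term by date, newest first, so that row number 1 is the most recent day, and then pick last_rc from that row while aggregating. The following is a sketch rather than a definitive answer. It assumes the table is called t, that it uses the short column names from the sample (min_rc, max_rc, zero_count, last_rc), that date is a real date column with the sample dates read as DD-MM-YYYY, and that "7 days" means the query date plus the six days before it.

```sql
SELECT search_term,
       SUM(search_count)                  AS search_count,
       MIN(min_rc)                        AS min_rc,
       MAX(max_rc)                        AS max_rc,
       SUM(zero_count)                    AS zero_count,
       -- only the newest row per term (rn = 1) contributes to last_rc
       MAX(last_rc) FILTER (WHERE rn = 1) AS last_rc
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY search_term
                              ORDER BY date DESC) AS rn
    FROM t
    -- assumed 7-day window ending on the query date (5 January 2020)
    WHERE date >  DATE '2020-01-05' - 7
      AND date <= DATE '2020-01-05'
) AS ranked
GROUP BY search_term
ORDER BY search_term;
```

Partitioning by search_term alone matters here: since (date, search_term) is unique, partitioning by both columns would leave every row in its own partition with row number 1.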

You can use window functions as shown below:

SELECT search_term,
       SUM(search_count) OVER (PARTITION BY search_term ORDER BY date) AS search_count,
       MIN(min_rc) OVER (PARTITION BY search_term ORDER BY date) AS min_rc,
       MAX(max_rc) OVER (PARTITION BY search_term ORDER BY date) AS max_rc,
       zero_count,
       last_rc,
       date
FROM t
ORDER BY search_term, date

Result set:

search_term    search_count    min_rc    max_rc    zero_count    last_rc   date
term1          10              10        20         0              4       01-01-2020
term1          20              10        21         0              5       02-01-2020
term1          30              10        22         0              6       03-01-2020
term1          40              10        23         0              7       04-01-2020
term1          50              10        24         0              8       05-01-2020
term2          10              24        25         0              9       01-01-2020
term2          20              23        26         0              10      02-01-2020
term2          30              22        27         0              11      03-01-2020
term2          50              0         29         0              12      04-01-2020
term2          50              0         29         3              0       04-01-2020

Updated version:

SELECT search_term, search_count, min_rc, max_rc, zero_count, last_rc
FROM
(SELECT search_term,
        SUM(search_count) OVER (PARTITION BY search_term ORDER BY date) AS search_count,
        MIN(min_rc) OVER (PARTITION BY search_term ORDER BY date) AS min_rc,
        MAX(max_rc) OVER (PARTITION BY search_term ORDER BY date) AS max_rc,
        -- running total, so zero-result days before the latest one are counted too
        SUM(zero_count) OVER (PARTITION BY search_term ORDER BY date) AS zero_count,
        last_rc,
        -- rnk = 1 marks the most recent row per search_term
        RANK() OVER (PARTITION BY search_term ORDER BY date DESC) AS rnk,
        date
 FROM t
 WHERE date <= '05-01-2020'
 ) A
WHERE A.rnk = 1

Here is another, simpler approach; after your comment I realized what you wanted.

SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc) AS min_rc,
       MAX(max_rc) AS max_rc,
       SUM(zero_count) AS zero_count,
       -- latest last_rc for this term under the same date cutoff; the subquery
       -- may reference only the grouped column t.search_term, not t.date
       (SELECT a.last_rc FROM t AS a
         WHERE a.search_term = t.search_term AND a.date <= '05-01-2020'
         ORDER BY a.date DESC LIMIT 1) AS last_rc,
       MAX(date) AS date
FROM t
WHERE date <= '05-01-2020'
GROUP BY search_term
ORDER BY search_term

Using the window function last_value is simpler:

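-- Window functions cannot reference ungrouped columns under GROUP BY, so every
-- aggregate is computed over the window and DISTINCT collapses the duplicates.
SELECT DISTINCT search_term,
       SUM(search_count) OVER w AS search_count,
       MIN(min_rc) OVER w AS min_rc,
       MAX(max_rc) OVER w AS max_rc,
       SUM(zero_count) OVER w AS zero_count,
       -- ascending date plus a full frame, so LAST_VALUE sees the newest row
       LAST_VALUE(last_rc) OVER (w ORDER BY date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_rc,
       MAX(date) OVER w AS date
FROM t
WHERE date <= '03-01-2020'
WINDOW w AS (PARTITION BY search_term)
ORDER BY search_term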

Result set using any of the updated versions:

search_term search_count    min_rc  max_rc  zero_count  last_rc
term1       50              10      24      0           8
term2       50              0       29      3           0
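
A further Postgres-specific option is an ordered aggregate: collect last_rc into an array sorted by date, newest first, and take the first element. The sketch below assumes the same table t and short column names, a date column of type date, sample dates read as DD-MM-YYYY, and a 7-day window that ends on the query date. None of these details are confirmed by the question.

```sql
SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc)       AS min_rc,
       MAX(max_rc)       AS max_rc,
       SUM(zero_count)   AS zero_count,
       -- array of last_rc sorted newest first; element 1 is the latest value
       (ARRAY_AGG(last_rc ORDER BY date DESC))[1] AS last_rc
FROM t
-- assumed window: the query date (5 January 2020) and the six days before it
WHERE date BETWEEN DATE '2020-01-05' - 6 AND DATE '2020-01-05'
GROUP BY search_term
ORDER BY search_term;
```

This keeps everything in a single GROUP BY with no subquery or window function, at the cost of building a small array per search term.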
