Aggregation in Postgres for finding last value
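I have a table in Postgres that holds aggregated data, with the following fields:
search_term --> a specific search term
date --> the date the searches were performed
search_count --> how many searches were performed with this search term
min_result_count --> the minimum number of results the search term returned
max_result_count --> the maximum number of results the search term returned
last_result_count --> the number of results returned by the most recent search
zero_result_count --> how many times this search term returned no results
The combination of date and search_term is unique, which means a search term is never repeated for a date; its values are updated instead.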
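I am trying to write a SQL query over a 7-day period that returns the following for each search term:
search_term
min_result_count
max_result_count
zero_result_count
last_result_count
I can get all of the other values with the aggregates MIN, MAX, and SUM, but I cannot work out last_result_count, because that requires taking only the last value.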
Here is a sample of the table, followed by the expected results:
search_term  search_count  min_rc  max_rc  zero_count  last_rc  date
--------------------------------------------------------------------------
term1        10            10      20      0           4        01-01-2020
term1        10            11      21      0           5        02-01-2020
term1        10            12      22      0           6        03-01-2020
term1        10            13      23      0           7        04-01-2020
term1        10            14      24      0           8        05-01-2020
term2        10            24      25      0           9        01-01-2020
term2        10            23      26      0           10       02-01-2020
term2        10            22      27      0           11       03-01-2020
term2        10            21      28      0           12       04-01-2020
term2        10            0       29      3           0        05-01-2020
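To try the queries in this thread locally, a minimal setup sketch could look like the following. The table name t matches the one used in the answers; the column types, the NOT NULL constraints, and the primary key are assumptions, not something stated in the question. The dates are read as DD-MM-YYYY (consecutive days in January 2020), and the last term2 row is taken to be 5 January 2020, because the question says each (search_term, date) pair is unique and the expected results only add up that way.

```sql
-- Hypothetical schema: types and constraints are assumptions.
CREATE TABLE t (
    search_term  text    NOT NULL,
    search_count integer NOT NULL,
    min_rc       integer NOT NULL,
    max_rc       integer NOT NULL,
    zero_count   integer NOT NULL,
    last_rc      integer NOT NULL,
    date         date    NOT NULL,
    PRIMARY KEY (search_term, date)   -- one row per term per day
);

-- Sample rows from the question, with dates written as ISO literals.
INSERT INTO t (search_term, search_count, min_rc, max_rc, zero_count, last_rc, date) VALUES
    ('term1', 10, 10, 20, 0,  4, DATE '2020-01-01'),
    ('term1', 10, 11, 21, 0,  5, DATE '2020-01-02'),
    ('term1', 10, 12, 22, 0,  6, DATE '2020-01-03'),
    ('term1', 10, 13, 23, 0,  7, DATE '2020-01-04'),
    ('term1', 10, 14, 24, 0,  8, DATE '2020-01-05'),
    ('term2', 10, 24, 25, 0,  9, DATE '2020-01-01'),
    ('term2', 10, 23, 26, 0, 10, DATE '2020-01-02'),
    ('term2', 10, 22, 27, 0, 11, DATE '2020-01-03'),
    ('term2', 10, 21, 28, 0, 12, DATE '2020-01-04'),
    ('term2', 10,  0, 29, 3,  0, DATE '2020-01-05');
```

ISO date literals sidestep the ambiguity of strings such as '05-01-2020', which Postgres parses according to the DateStyle setting (month first by default).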
If I run the query for 05-01-2020, I should get:
search_term  search_count  min_rc  max_rc  zero_count  last_rc
---------------------------------------------------------------
term1        50            10      24      0           8
term2        50            0       29      3           0
If I run the query for 04-01-2020, I should get:
search_term  search_count  min_rc  max_rc  zero_count  last_rc
---------------------------------------------------------------
term1        40            10      23      0           7
term2        40            21      28      0           12
If I run the query for 03-01-2020, I should get:
search_term  search_count  min_rc  max_rc  zero_count  last_rc
---------------------------------------------------------------
term1        30            10      22      0           6
term2        30            22      27      0           11
And so on. Any help with deriving last_result_count would be much appreciated.
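You can use the ROW_NUMBER window function for this.
ROW_NUMBER orders the rows within each partition and assigns each one a sequential number.
ROW_NUMBER() OVER (PARTITION BY search_term ORDER BY date) AS row_numbered_column
You can then group the data and use MAX(row_numbered_column) to identify the latest row for each search term; that row's last_rc is the value you are after.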
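A fuller sketch of the ROW_NUMBER idea, under a few assumptions: the table is called t with the short column names from the sample data, the cutoff is 5 January 2020, and the rows are numbered from the newest date downwards so that the latest row per term is always number 1 (which avoids having to look up the maximum row number first).

```sql
-- Assumes table t with unique (search_term, date) pairs; the cutoff date is illustrative.
WITH numbered AS (
    SELECT t.*,
           -- newest row per search term gets rn = 1
           ROW_NUMBER() OVER (PARTITION BY search_term ORDER BY date DESC) AS rn
    FROM t
    WHERE date <= DATE '2020-01-05'
)
SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc)       AS min_rc,
       MAX(max_rc)       AS max_rc,
       SUM(zero_count)   AS zero_count,
       -- only the rn = 1 row passes the filter, so MAX simply unwraps its value
       MAX(last_rc) FILTER (WHERE rn = 1) AS last_rc
FROM numbered
GROUP BY search_term
ORDER BY search_term;
```

The FILTER clause keeps everything in a single GROUP BY, so the ordinary aggregates and the "last value" come out of one pass over the numbered rows.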
You can use window functions, as shown below.
SELECT search_term,
       SUM(search_count) OVER (PARTITION BY search_term ORDER BY date) AS search_count,
       MIN(min_rc) OVER (PARTITION BY search_term ORDER BY date) AS min_rc,
       MAX(max_rc) OVER (PARTITION BY search_term ORDER BY date) AS max_rc,
       zero_count,
       last_rc,
       date
FROM t
ORDER BY search_term, date;
Result set:
search_term  search_count  min_rc  max_rc  zero_count  last_rc  date
term1        10            10      20      0           4        01-01-2020
term1        20            10      21      0           5        02-01-2020
term1        30            10      22      0           6        03-01-2020
term1        40            10      23      0           7        04-01-2020
term1        50            10      24      0           8        05-01-2020
term2        10            24      25      0           9        01-01-2020
term2        20            23      26      0           10       02-01-2020
term2        30            22      27      0           11       03-01-2020
term2        40            21      28      0           12       04-01-2020
term2        50            0       29      3           0        05-01-2020
Updated version:
SELECT search_term, search_count, min_rc, max_rc, zero_count, last_rc
FROM (
    SELECT search_term,
           SUM(search_count) OVER (PARTITION BY search_term ORDER BY date) AS search_count,
           MIN(min_rc) OVER (PARTITION BY search_term ORDER BY date) AS min_rc,
           MAX(max_rc) OVER (PARTITION BY search_term ORDER BY date) AS max_rc,
           -- running total, so zero-result counts from earlier days are included
           SUM(zero_count) OVER (PARTITION BY search_term ORDER BY date) AS zero_count,
           last_rc,
           -- rank 1 is the most recent row for each search term
           RANK() OVER (PARTITION BY search_term ORDER BY date DESC) AS rnk,
           date
    FROM t
    WHERE date <= '05-01-2020'
) A
WHERE A.rnk = 1;
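Here is another, simpler way, now that your comment has made clear what you want.
SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc) AS min_rc,
       MAX(max_rc) AS max_rc,
       SUM(zero_count) AS zero_count,
       -- latest last_rc for this term on or before the cutoff date;
       -- the subquery may only reference the grouped column t.search_term
       (SELECT a.last_rc
        FROM t AS a
        WHERE a.search_term = t.search_term
          AND a.date <= '05-01-2020'
        ORDER BY a.date DESC
        LIMIT 1) AS last_rc,
       MAX(date) AS date
FROM t
WHERE date <= '05-01-2020'
GROUP BY search_term
ORDER BY search_term;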
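The question asks for a 7-day period, while the queries in this answer only apply an upper bound on the date. A sketch that adds the lower bound might look like this; the cutoff of 5 January 2020 and the reading of "7 days" as the cutoff day plus the six days before it are assumptions.

```sql
-- Assumes table t with unique (search_term, date) pairs; the cutoff date is illustrative.
SELECT t.search_term,
       SUM(t.search_count) AS search_count,
       MIN(t.min_rc)       AS min_rc,
       MAX(t.max_rc)       AS max_rc,
       SUM(t.zero_count)   AS zero_count,
       -- latest row for this term on or before the cutoff; a group only exists
       -- when the term has a row inside the window, so that row is in the window too
       (SELECT a.last_rc
        FROM t AS a
        WHERE a.search_term = t.search_term
          AND a.date <= DATE '2020-01-05'
        ORDER BY a.date DESC
        LIMIT 1)           AS last_rc
FROM t
WHERE t.date >  DATE '2020-01-05' - 7   -- date minus integer yields a date
  AND t.date <= DATE '2020-01-05'
GROUP BY t.search_term
ORDER BY t.search_term;
```

In application code the literal cutoff would normally be a bound parameter, so the same statement can be reused for any reporting date.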
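The window function LAST_VALUE is another option. Every output column is computed as a window function here, so DISTINCT (rather than GROUP BY) collapses the result to one row per search term, and the frame is widened to the whole partition so that LAST_VALUE reads the latest row.
SELECT DISTINCT search_term,
       SUM(search_count) OVER w AS search_count,
       MIN(min_rc) OVER w AS min_rc,
       MAX(max_rc) OVER w AS max_rc,
       SUM(zero_count) OVER w AS zero_count,
       -- explicit frame: without it LAST_VALUE stops at the current row
       LAST_VALUE(last_rc) OVER (PARTITION BY search_term ORDER BY date
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_rc,
       MAX(date) OVER w AS date
FROM t
WHERE date <= '05-01-2020'
WINDOW w AS (PARTITION BY search_term)
ORDER BY search_term;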
Result set with any of the updated versions (for 05-01-2020):
search_term  search_count  min_rc  max_rc  zero_count  last_rc
term1        50            10      24      0           8
term2        50            0       29      3           0
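A note on LAST_VALUE, since its behaviour is easy to misread: when a window has an ORDER BY and no explicit frame, the frame ends at the current row (and its peers), so LAST_VALUE does not look ahead to the end of the partition. The sketch below puts the two variants side by side; it assumes table t with unique (search_term, date) pairs, and the output column names are made up for illustration.

```sql
SELECT search_term,
       date,
       last_rc,
       -- default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
       -- so with unique dates this is just the current row's last_rc
       LAST_VALUE(last_rc) OVER (PARTITION BY search_term ORDER BY date) AS default_frame,
       -- full-partition frame: the last_rc of the newest row for the term
       LAST_VALUE(last_rc) OVER (PARTITION BY search_term ORDER BY date
                                 ROWS BETWEEN UNBOUNDED PRECEDING
                                          AND UNBOUNDED FOLLOWING) AS full_frame
FROM t
ORDER BY search_term, date;
```

An equivalent way to avoid the frame clause is FIRST_VALUE(last_rc) with ORDER BY date DESC, because the first row of the frame is always the start of the partition.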