Aggregation in Postgres for finding last value

I have a single table in Postgres that holds aggregated data. The table has the following fields:

search_term --> a particular search term
date --> the date on which the search was performed
search_count --> how many times a search was performed with this search term
min_result_count --> the minimum number of results returned for the search term
max_result_count --> the maximum number of results returned for the search term
last_result_count --> the number of results returned by the most recent search
zero_result_count --> how many times this search term returned no results

The combination of date and search_term is unique, meaning a search term is never repeated for the same date; instead, the existing row is updated.

I am trying to write a SQL query over a period of 7 days that returns the following columns:
search_term
min_result_count
max_result_count
zero_result_count
last_result_count

I can compute most of the values with the aggregates MIN, MAX and SUM, but I am unable to find the value for last_result_count, since that requires picking only the last value in the period.

Here is a sample table with the expected results:

search_term    search_count    min_rc    max_rc    zero_count    last_rc    date
---------------------------------------------------------------------------------------
term1          10              10        20        0              4        01-01-2020
term1          10              11        21        0              5        02-01-2020
term1          10              12        22        0              6        03-01-2020
term1          10              13        23        0              7        04-01-2020
term1          10              14        24        0              8        05-01-2020

term2          10              24        25        0              9        01-01-2020
term2          10              23        26        0              10       02-01-2020
term2          10              22        27        0              11       03-01-2020
term2          10              21        28        0              12       04-01-2020
term2          10              0         29        3              0        04-01-2020

If I run the query on 05-01-2020, I should get:

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          50              10        24        0              8      
term2          50              0         29        3              0     

If I run the query on 04-01-2020, I should get:

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          40              10        23        0              7      
term2          40              21        28        0              12     

If I run the query on 03-01-2020, I should get:

search_term    search_count    min_rc    max_rc    zero_count    last_rc
-------------------------------------------------------------------------
term1          30              10        22        0              6      
term2          30              22        27        0              11     
  • rc stands for result_count

And so on. Any help with deriving last_result_count would be much appreciated.
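For experimenting with the queries discussed in this thread, the sample table can be recreated roughly as follows. This is only a sketch: the table name `t` and the short column names follow the sample table and the answers, while the column types are assumptions. The dates are read as DD-MM-YYYY and written as ISO literals so they do not depend on the server's DateStyle setting. The final term2 row is assumed to belong to 05-01-2020, because the expected results and the uniqueness rule on (search_term, date) only fit that reading.

```sql
-- Scratch table mirroring the sample data (column types are assumptions).
CREATE TABLE t (
    search_term  text    NOT NULL,
    search_count integer NOT NULL,
    min_rc       integer NOT NULL,
    max_rc       integer NOT NULL,
    zero_count   integer NOT NULL,
    last_rc      integer NOT NULL,
    date         date    NOT NULL,
    PRIMARY KEY (search_term, date)  -- one row per term per day
);

INSERT INTO t (search_term, search_count, min_rc, max_rc, zero_count, last_rc, date) VALUES
    ('term1', 10, 10, 20, 0,  4, DATE '2020-01-01'),
    ('term1', 10, 11, 21, 0,  5, DATE '2020-01-02'),
    ('term1', 10, 12, 22, 0,  6, DATE '2020-01-03'),
    ('term1', 10, 13, 23, 0,  7, DATE '2020-01-04'),
    ('term1', 10, 14, 24, 0,  8, DATE '2020-01-05'),
    ('term2', 10, 24, 25, 0,  9, DATE '2020-01-01'),
    ('term2', 10, 23, 26, 0, 10, DATE '2020-01-02'),
    ('term2', 10, 22, 27, 0, 11, DATE '2020-01-03'),
    ('term2', 10, 21, 28, 0, 12, DATE '2020-01-04'),
    -- Assumption: listed as 04-01-2020 in the sample, treated here as 05-01-2020.
    ('term2', 10,  0, 29, 3,  0, DATE '2020-01-05');
```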

You can use the ROW_NUMBER window function for this. ROW_NUMBER numbers the rows within each partition in the order you specify.

ROW_NUMBER() OVER (PARTITION BY search_term ORDER BY date DESC) AS row_numbered_column

You can then group your data by search_term and take last_rc from the row where row_numbered_column = 1, which is the most recent row for that search term.
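One way the ROW_NUMBER idea might be turned into a complete query is sketched below. It assumes the table is called `t` with the short column names from the sample table, that the rows are numbered per search term from newest to oldest, and that the 7-day period ends on a report date written here as the literal `DATE '2020-01-05'` (a placeholder, not something the answer specifies).

```sql
WITH ranked AS (
    SELECT t.*,
           -- rn = 1 is the most recent row of each search term in the period
           ROW_NUMBER() OVER (PARTITION BY search_term ORDER BY date DESC) AS rn
    FROM t
    -- 7 days ending on the report date (date minus integer yields a date)
    WHERE date BETWEEN DATE '2020-01-05' - 6 AND DATE '2020-01-05'
)
SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc)       AS min_rc,
       MAX(max_rc)       AS max_rc,
       SUM(zero_count)   AS zero_count,
       -- only the rn = 1 row contributes, so MAX simply returns its last_rc
       MAX(last_rc) FILTER (WHERE rn = 1) AS last_rc
FROM ranked
GROUP BY search_term
ORDER BY search_term;
```

The FILTER clause keeps everything in a single GROUP BY pass; on PostgreSQL versions without FILTER, `MAX(CASE WHEN rn = 1 THEN last_rc END)` should behave the same way.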

You could use window functions, as shown below.

Select search_term ,
SUM(search_count) OVER (partition by search_term order BY date)  as search_count,
MIN(min_rc) OVER (partition by search_term order BY date)  as min_rc,
MAX(max_rc) OVER (partition by search_term order BY date)  as max_rc,
zero_count,
last_rc , 
DATE 
from t
ORDER BY search_term,date 

Result set:

search_term    search_count    min_rc    max_rc    zero_count    last_rc   date
term1          10              10        20         0              4       01-01-2020
term1          20              10        21         0              5       02-01-2020
term1          30              10        22         0              6       03-01-2020
term1          40              10        23         0              7       04-01-2020
term1          50              10        24         0              8       05-01-2020
term2          10              24        25         0              9       01-01-2020
term2          20              23        26         0              10      02-01-2020
term2          30              22        27         0              11      03-01-2020
term2          50              0         29         0              12      04-01-2020
term2          50              0         29         3              0       04-01-2020

Updated version:

SELECT search_term, search_count, min_rc, max_rc, zero_count, last_rc
FROM
(SELECT search_term,
        -- running aggregates per search term, accumulated in date order
        SUM(search_count) OVER (PARTITION BY search_term ORDER BY date) AS search_count,
        MIN(min_rc) OVER (PARTITION BY search_term ORDER BY date) AS min_rc,
        MAX(max_rc) OVER (PARTITION BY search_term ORDER BY date) AS max_rc,
        SUM(zero_count) OVER (PARTITION BY search_term ORDER BY date) AS zero_count,
        last_rc,
        -- rnk = 1 marks the most recent row of each search term
        RANK() OVER (PARTITION BY search_term ORDER BY date DESC) AS rnk,
        date
 FROM t
 WHERE date <= '05-01-2020'
 ) A
WHERE A.rnk = 1

Here is another, simpler method, which I arrived at once your comment made clear what you wanted.

SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc) AS min_rc,
       MAX(max_rc) AS max_rc,
       SUM(zero_count) AS zero_count,
       -- last_rc of the most recent row for this search term in the same date range
       (SELECT a.last_rc FROM t AS a
        WHERE a.search_term = t.search_term AND a.date <= '05-01-2020'
        ORDER BY a.date DESC LIMIT 1) AS last_rc,
       MAX(date) AS date
FROM t
WHERE date <= '05-01-2020'
GROUP BY search_term
ORDER BY search_term

This is even simpler using the window function LAST_VALUE:

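SELECT DISTINCT search_term,
       SUM(search_count) OVER (PARTITION BY search_term) AS search_count,
       MIN(min_rc) OVER (PARTITION BY search_term) AS min_rc,
       MAX(max_rc) OVER (PARTITION BY search_term) AS max_rc,
       SUM(zero_count) OVER (PARTITION BY search_term) AS zero_count,
       -- explicit frame so LAST_VALUE sees the whole partition, not only rows up to the current one
       LAST_VALUE(last_rc) OVER (PARTITION BY search_term ORDER BY date
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_rc,
       MAX(date) OVER (PARTITION BY search_term) AS date
FROM t
WHERE date <= '05-01-2020'
ORDER BY search_term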

Result set using any of the updated versions:

search_term search_count    min_rc  max_rc  zero_count  last_rc
term1       50              10      24      0           8
term2       50              0       29      3           0
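A further PostgreSQL-specific option, not taken from the answers above, is an ordered aggregate: `array_agg` accepts an ORDER BY inside the call, so the first array element is the last_rc of the most recent row. The sketch below assumes the same table `t` and short column names, and uses the literal `DATE '2020-01-05'` as a placeholder report date for the 7-day period.

```sql
SELECT search_term,
       SUM(search_count) AS search_count,
       MIN(min_rc)       AS min_rc,
       MAX(max_rc)       AS max_rc,
       SUM(zero_count)   AS zero_count,
       -- collect last_rc newest first, then take element 1 (arrays are 1-based)
       (array_agg(last_rc ORDER BY date DESC))[1] AS last_rc
FROM t
WHERE date >  DATE '2020-01-05' - 7   -- 7 days ending on the report date
  AND date <= DATE '2020-01-05'
GROUP BY search_term
ORDER BY search_term;
```

This stays within a plain GROUP BY with no subquery or window function. The trade-off is that an array is built per group, which is unlikely to matter for seven rows per search term.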
