Combining multiple rows based on recent values in PostgreSQL

This is my first question here, so I will try to explain it well.

I have a specific need that I tried to write a query for, but I didn't succeed. I also googled it and found nothing, though my search terms were probably poor, since it does not seem like it should be that hard.

Here is an example of the table and data I have (dates are shown as dd-MM-yyyy):

----------------------------------------------------------------------------
|   id    |   asset_id   |    value    |    start_date    |    end_date    |
----------------------------------------------------------------------------
|    1    |       1      |    value1   |    20-10-2020    |   31-10-2020   |
----------------------------------------------------------------------------
|    1    |       1      |    value1   |    01-11-2020    |   05-11-2020   |
----------------------------------------------------------------------------
|    1    |       2      |    value2   |    05-10-2020    |   10-10-2020   |
----------------------------------------------------------------------------
|    1    |       2      |    value3   |    10-10-2020    |   15-10-2020   |
----------------------------------------------------------------------------
|    1    |       3      |    value3   |    15-08-2020    |   31-08-2020   |
----------------------------------------------------------------------------
|    1    |       3      |    value1   |    01-09-2020    |   05-09-2020   |
----------------------------------------------------------------------------
|    1    |       3      |    value1   |    05-09-2020    |   10-09-2020   |
----------------------------------------------------------------------------

What I need is to look at the two most recent rows within each (id, asset_id) group. If the value of these two rows is the same, combine them into one row, taking the start_date of the earlier row and the end_date of the later one. If the values do not match, nothing should be done.

For the input table above, the desired output is:

----------------------------------------------------------------------------
|   id    |   asset_id   |    value    |    start_date    |    end_date    |
----------------------------------------------------------------------------
|    1    |       1      |    value1   |    20-10-2020    |   05-11-2020   |
----------------------------------------------------------------------------
|    1    |       2      |    value2   |    05-10-2020    |   10-10-2020   |
----------------------------------------------------------------------------
|    1    |       2      |    value3   |    10-10-2020    |   15-10-2020   |
----------------------------------------------------------------------------
|    1    |       3      |    value3   |    15-08-2020    |   31-08-2020   |
----------------------------------------------------------------------------
|    1    |       3      |    value1   |    01-09-2020    |   10-09-2020   |
----------------------------------------------------------------------------

For the (id, asset_id) group (1,1), the two rows from the input table should be combined as described, because their value is the same. So the 1st and 2nd input rows become the 1st row of the output. For the (1,2) group the values differ, so nothing should be combined. For the (1,3) group, the two most recent rows (the 6th and 7th of the input) should be combined into the 5th row of the output table.

It does not seem hard, but I am having trouble coming up with something concrete. I made an sqlfiddle where anyone can try it.

Any help is really appreciated.

You can filter the top two rows per group with row_number(). Then aggregate by value: if both rows in the group have the same value, they are grouped together; otherwise they end up in two separate groups.

So:

select id, asset_id, value, min(start_date) start_date, max(end_date) end_date
from (
    select t.*,
        row_number() over(partition by id, asset_id order by start_date desc) rn
    from mytable t
) t
where rn <= 2
group by id, asset_id, value
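
To check the query against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. This is not PostgreSQL: SQLite is only a stand-in, it needs version 3.25 or later for window functions, and the dates are stored as ISO yyyy-MM-dd strings so that they sort correctly as text. The names ROWS, TOP_TWO, KEEP_OLDER and run are made up for this illustration, and the order by clauses were added only to make the output order predictable. Note that the rn <= 2 filter discards any rows older than the two most recent, so the value3 row of asset 3 does not appear in the first result. The second query is one possible variant (an assumption, not part of the answer above) that keeps those older rows by giving each of them its own group.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
ROWS = [
    (1, 1, "value1", "2020-10-20", "2020-10-31"),
    (1, 1, "value1", "2020-11-01", "2020-11-05"),
    (1, 2, "value2", "2020-10-05", "2020-10-10"),
    (1, 2, "value3", "2020-10-10", "2020-10-15"),
    (1, 3, "value3", "2020-08-15", "2020-08-31"),
    (1, 3, "value1", "2020-09-01", "2020-09-05"),
    (1, 3, "value1", "2020-09-05", "2020-09-10"),
]

# The query from the answer, plus an order by for a stable row order.
TOP_TWO = """
select id, asset_id, value, min(start_date) as start_date, max(end_date) as end_date
from (
    select t.*,
        row_number() over(partition by id, asset_id order by start_date desc) as rn
    from mytable t
) t
where rn <= 2
group by id, asset_id, value
order by id, asset_id, start_date
"""

# Hypothetical variant: no filter on rn. The two most recent rows share
# grouping key 0, so they merge when their value matches; every older row
# gets its own key (its rn) and therefore passes through unchanged.
KEEP_OLDER = """
select id, asset_id, value, min(start_date) as start_date, max(end_date) as end_date
from (
    select t.*,
        row_number() over(partition by id, asset_id order by start_date desc) as rn
    from mytable t
) t
group by id, asset_id, value, case when rn <= 2 then 0 else rn end
order by id, asset_id, start_date
"""


def run(query):
    """Load the sample rows into a fresh in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable "
        "(id integer, asset_id integer, value text, start_date text, end_date text)"
    )
    conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


top_two = run(TOP_TWO)
keep_older = run(KEEP_OLDER)

for row in keep_older:
    print(row)
```

On this sample data, top_two has four rows (the older value3 row of asset 3 is gone), while keep_older has five rows, matching the desired output table in the question.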
