How to count repeating values in a PostgreSQL column?

Question

Hi, I have a table like the one below, and I want to count the repeating values in the status column. I don't want to count duplicates across the whole column. For example, I just want to count how many times "Offline" appears before the value changes to "Idle".

This is the result I wanted. Thank you.

Answer 1

This is often called a gaps-and-islands problem.

One way to do it is with two sequences of row numbers.

Examine each intermediate result of the query to understand how it works.

WITH
CTE_rn
AS
(
    SELECT
        status
        ,dt
        ,ROW_NUMBER() OVER (ORDER BY dt) as rn1
        ,ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) as rn2
    FROM
        T
)
SELECT
    status
    ,COUNT(*) AS cnt
FROM
    CTE_rn
GROUP BY
    status
    ,rn1-rn2
ORDER BY
    min(dt)
;

Result

| status  | cnt |
|---------|-----|
| offline | 2   |
| idle    | 1   |
| offline | 2   |
| idle    | 1   |
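
To look at the intermediate result without a PostgreSQL instance at hand, here is a minimal sketch that runs the same SQL through Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions). The six sample rows and their timestamps are made up, chosen only so that the final counts match the result above; the table name T and the columns status and dt come from the query.

```python
import sqlite3

# Hypothetical sample data: the original table is not shown, so these
# rows are invented to reproduce the counts 2, 1, 2, 1 from the result.
rows = [
    ("offline", "2020-03-01 10:00"),
    ("offline", "2020-03-01 10:05"),
    ("idle",    "2020-03-01 10:10"),
    ("offline", "2020-03-01 10:15"),
    ("offline", "2020-03-01 10:20"),
    ("idle",    "2020-03-01 10:25"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (status TEXT, dt TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", rows)

# Intermediate result: the two row-number sequences from CTE_rn.
intermediate = conn.execute("""
    SELECT status,
           ROW_NUMBER() OVER (ORDER BY dt) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) AS rn2
    FROM T
    ORDER BY dt
""").fetchall()

for status, rn1, rn2 in intermediate:
    # rn1 - rn2 is the value the final query groups by.
    print(status, rn1, rn2, rn1 - rn2)

# Final result: one row per run of consecutive equal statuses.
final = conn.execute("""
    WITH CTE_rn AS (
        SELECT status, dt,
               ROW_NUMBER() OVER (ORDER BY dt) AS rn1,
               ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) AS rn2
        FROM T
    )
    SELECT status, COUNT(*) AS cnt
    FROM CTE_rn
    GROUP BY status, rn1 - rn2
    ORDER BY MIN(dt)
""").fetchall()
print(final)
```

With this sample data the differences rn1 - rn2 come out as 0, 0, 2, 1, 1, 4. Within one status the difference stays constant while rows are consecutive and changes as soon as another status interrupts the run, which is why grouping by status and rn1 - rn2 isolates each run.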

Answer 2

WITH 
cte1 AS ( SELECT status, 
                 "date", 
                 workstation, 
                 CASE WHEN status = LAG(status) OVER (PARTITION BY workstation ORDER BY "date")
                      THEN 0
                      ELSE 1 END changed
          FROM test ),
cte2 AS ( SELECT status, 
                 "date", 
                 workstation, 
                 SUM(changed) OVER (PARTITION BY workstation ORDER BY "date") group_num 
          FROM cte1 )
SELECT status, COUNT(*) "count", workstation, MIN("date") "from", MAX("date") "till"
FROM cte2
GROUP BY group_num, status, workstation;
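
The query above marks every row whose status differs from the previous row on the same workstation (changed = 1), then takes a running sum of those marks so that each run of equal statuses gets its own group_num. A minimal sketch of that idea, again run through Python's sqlite3 module (assuming SQLite 3.25 or newer), might look like this. The sample rows and the workstation names ws1 and ws2 are hypothetical, and the ORDER BY on the final query is an addition to make the output order predictable.

```python
import sqlite3

# Hypothetical sample data for two workstations.
rows = [
    ("offline", "2020-03-01 10:00", "ws1"),
    ("offline", "2020-03-01 10:05", "ws1"),
    ("idle",    "2020-03-01 10:10", "ws1"),
    ("offline", "2020-03-01 10:15", "ws1"),
    ("idle",    "2020-03-01 10:00", "ws2"),
    ("idle",    "2020-03-01 10:05", "ws2"),
    ("offline", "2020-03-01 10:10", "ws2"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test (status TEXT, "date" TEXT, workstation TEXT)')
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

result = conn.execute("""
    WITH
    cte1 AS (
        -- 1 when the status differs from the previous row (or there is none)
        SELECT status, "date", workstation,
               CASE WHEN status = LAG(status) OVER (PARTITION BY workstation ORDER BY "date")
                    THEN 0
                    ELSE 1 END AS changed
        FROM test
    ),
    cte2 AS (
        -- running sum of the change marks numbers the runs 1, 2, 3, ...
        SELECT status, "date", workstation,
               SUM(changed) OVER (PARTITION BY workstation ORDER BY "date") AS group_num
        FROM cte1
    )
    SELECT workstation, group_num, status, COUNT(*), MIN("date"), MAX("date")
    FROM cte2
    GROUP BY group_num, status, workstation
    ORDER BY workstation, group_num
""").fetchall()

for row in result:
    print(row)
```

Because the first row of each workstation has no previous row, LAG returns NULL there, the comparison is not true, and the CASE falls through to 1, so every workstation starts with group 1.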

fiddle

2 answers

Answer 1
2 2020-03-04 05:40:08

Answer 2
0 2020-03-04 05:41:10
