How to aggregate count of rows having the same value in a sequence?
I have a query that returns data like the following sample:
SELECT timestamp, atm_id FROM TRANSACTIONS ORDER BY TIMESTAMP ASC;
TIMESTAMP | ATM_ID |
--------------------
2010-01-01 | EP02 |
2010-01-01 | EP02 |
2010-01-28 | EP02 |
2010-02-07 | EP02 |
2010-02-09 | EP11 |
2010-03-19 | EP11 |
2010-03-19 | EP02 |
2010-04-03 | EP05 |
2010-04-30 | EP02 |
I know how to group by ATM_ID and show the row count next to each one:
SELECT
ATM_ID,
COUNT(*) CNT
FROM
TRANSACTIONS
GROUP BY
ATM_ID;
With the sample data above, this produces something like
ATM_ID | CNT
---------------
EP02 | 6
EP11 | 2
EP05 | 1
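However, I am interested in grouping at a different level. If an ATM_ID repeats over consecutive rows, the output should give the number of consecutive rows with that ATM_ID, even if the same ATM_ID shows up again later, after a different ATM_ID.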
ATM_ID | CNT
---------------
EP02 | 4 --Four rows of ATM_ID EP02
EP11 | 2 --Followed by 2 rows of ATM_ID EP11
EP02 | 1 --Followed by 1 row of ATM_ID EP02
EP05 | 1 --Followed by 1 row of ATM_ID EP05
EP02 | 1 --Followed by 1 row of ATM_ID EP02
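Ignore the comments on the right; they are only there for clarification and are not part of the output. Is that possible?
PS: The answer by Syed Aladeen below gives the correct counts, but in the wrong order. For convenience, I have created a SQL Fiddle:
Try this: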
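-- Assumes TRANSACTIONS has an id column that reflects the desired row order.
select atm_id, count(*) as cnt
from (select t.*,
             -- stays constant within each run of consecutive equal atm_id values
             (row_number() over (order by id) -
              row_number() over (partition by atm_id order by id)
             ) as grp
      from TRANSACTIONS t
     ) runs
group by grp, atm_id
order by max(id);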
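To see why the difference of the two row numbers identifies each run, it can help to print the intermediate values. The sketch below is self-contained (Oracle syntax). The id column and its values 1 to 9 are assumptions, since the question's table shows only TIMESTAMP and ATM_ID; they are chosen so that the ordering is unambiguous even where timestamps tie. The names rn_all and rn_atm are illustrative.

```sql
-- Sample data from the question; the id column is hypothetical and only
-- serves as a unique ordering key.
with t(id, atm_id) as (
  select 1, 'EP02' from dual union all
  select 2, 'EP02' from dual union all
  select 3, 'EP02' from dual union all
  select 4, 'EP02' from dual union all
  select 5, 'EP11' from dual union all
  select 6, 'EP11' from dual union all
  select 7, 'EP02' from dual union all
  select 8, 'EP05' from dual union all
  select 9, 'EP02' from dual
)
select id,
       atm_id,
       -- position of the row in the whole ordered set
       row_number() over (order by id) as rn_all,
       -- position of the row among rows with the same atm_id
       row_number() over (partition by atm_id order by id) as rn_atm,
       -- the difference is the run identifier used by the query above
       row_number() over (order by id)
         - row_number() over (partition by atm_id order by id) as grp
from t
order by id;
```

Within a run of identical atm_id values both counters advance by one per row, so grp stays constant. When another atm_id intervenes, only rn_all advances, so grp has a different value the next time that atm_id returns. For this data grp is 0 for ids 1 to 4, 4 for ids 5 and 6, and then 2, 7 and 3 for ids 7, 8 and 9. Two different atm_id values can in general end up with the same grp, which is why the query groups by both grp and atm_id.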
-- Oracle 12c+: pattern matching
with s(dt, atm_id) as (
select to_date('2010-01-01', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-01-01', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-01-28', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-02-07', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-02-09', 'yyyy-mm-dd'), 'EP11' from dual union all
select to_date('2010-03-19', 'yyyy-mm-dd'), 'EP11' from dual union all
select to_date('2010-03-19', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-04-03', 'yyyy-mm-dd'), 'EP05' from dual union all
select to_date('2010-04-30', 'yyyy-mm-dd'), 'EP02' from dual)
select *
from s
match_recognize (
order by dt
measures v.atm_id as atm_id,
count(v.atm_id) as cnt,
first(dt) as min_dt,
last (dt) as max_dt
pattern (v+)
define v as v.atm_id = first(atm_id)
);
ATM_ CNT MIN_DT MAX_DT
---- ---------- ------------------- -------------------
EP02 4 2010-01-01 00:00:00 2010-02-07 00:00:00
EP11 2 2010-02-09 00:00:00 2010-03-19 00:00:00
EP02 1 2010-03-19 00:00:00 2010-03-19 00:00:00
EP05 1 2010-04-03 00:00:00 2010-04-03 00:00:00
EP02 1 2010-04-30 00:00:00 2010-04-30 00:00:00
Elapsed: 00:00:00.01
-- Oracle 8i+: window sort + window buffer + group by [+ order by]
with s(dt, atm_id) as (
select to_date('2010-01-01', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-01-01', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-01-28', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-02-07', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-02-09', 'yyyy-mm-dd'), 'EP11' from dual union all
select to_date('2010-03-19', 'yyyy-mm-dd'), 'EP11' from dual union all
select to_date('2010-03-19', 'yyyy-mm-dd'), 'EP02' from dual union all
select to_date('2010-04-03', 'yyyy-mm-dd'), 'EP05' from dual union all
select to_date('2010-04-30', 'yyyy-mm-dd'), 'EP02' from dual)
select atm_id, count(*) cnt, min(dt) min_dt, max(dt) as max_dt
from
(select dt, atm_id, count(lg) over (order by dt) ct, lg
from
(select dt, atm_id, decode(atm_id, lag(atm_id) over (order by dt), null, 1) lg
from s
)
)
group by ct, atm_id
order by min_dt;
ATM_ CNT MIN_DT MAX_DT
---- ---------- ------------------- -------------------
EP02 4 2010-01-01 00:00:00 2010-02-07 00:00:00
EP11 1 2010-02-09 00:00:00 2010-02-09 00:00:00
EP02 1 2010-03-19 00:00:00 2010-03-19 00:00:00
EP11 1 2010-03-19 00:00:00 2010-03-19 00:00:00
EP05 1 2010-04-03 00:00:00 2010-04-03 00:00:00
EP02 1 2010-04-30 00:00:00 2010-04-30 00:00:00
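6 rows selected.
-- Note: this result has 6 rows rather than the expected 5. Two rows share
-- dt 2010-03-19, so "order by dt" alone does not fix their relative order for
-- lag(), and the default RANGE window of count(lg) over (order by dt) treats
-- them as peers and gives both the same running count.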
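One way to make the lag-based query deterministic is to order by a unique key instead of dt. The following is a minimal sketch, assuming a hypothetical id column that reflects the intended row order (the question's sample shows no such column); everything else follows the query above. The column list on the with clause is only there to build the demo data and needs a newer Oracle release than the analytic functions themselves.

```sql
-- Same data as above plus a hypothetical unique ordering key (id).
with s(id, dt, atm_id) as (
  select 1, date '2010-01-01', 'EP02' from dual union all
  select 2, date '2010-01-01', 'EP02' from dual union all
  select 3, date '2010-01-28', 'EP02' from dual union all
  select 4, date '2010-02-07', 'EP02' from dual union all
  select 5, date '2010-02-09', 'EP11' from dual union all
  select 6, date '2010-03-19', 'EP11' from dual union all
  select 7, date '2010-03-19', 'EP02' from dual union all
  select 8, date '2010-04-03', 'EP05' from dual union all
  select 9, date '2010-04-30', 'EP02' from dual
)
select atm_id, count(*) cnt, min(dt) min_dt, max(dt) max_dt
from
  (select id, dt, atm_id,
          -- running count of run starts; id is unique, so no two rows are peers
          count(lg) over (order by id) ct
   from
     (select id, dt, atm_id,
             -- 1 marks the first row of a run, null marks a continuation
             decode(atm_id, lag(atm_id) over (order by id), null, 1) lg
      from s
     )
  )
group by ct, atm_id
order by min(id);
```

With this ordering the run markers are 1, null, null, null, 1, null, 1, 1, 1, the running counts are 1, 1, 1, 1, 2, 2, 3, 4, 5, and the query returns five groups: EP02 with 4 rows, EP11 with 2, then EP02, EP05 and EP02 with 1 row each. The same caveat about ties applies to the order by dt in the match_recognize version; there, too, a unique ordering key would remove the ambiguity.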