简体   繁体   中英

count(distinct) over (partition by… doesn't work in Oracle SQL

在此输入图像描述

I want to count the distinct day_number over the past 30 days. However, the distinct function can't be used with over

If I delete distinct , it will give me the total count of the day_number , but day_number can have lots of duplicate. So that's why I want to add distinct .

select tr.*,
       count( distinct day_number) OVER (PARTITION BY ACCOUNT ORDER BY DAY_number range 29 PRECEDING) as result
from table tr;

Can anyone please show me how to count the distinct numbers in the over(partition by..) statement? Thanks in advance.

You can do this by first creating a column that only lists each id once, and then doing a range-count on that column, eg:

WITH sd AS (SELECT 1 ID, 10 val FROM dual UNION ALL
            SELECT 1 ID, 20 val FROM dual UNION ALL
            SELECT 2 ID, 30 val FROM dual UNION ALL
            SELECT 2 ID, 40 val FROM dual UNION ALL
            SELECT 4 ID, 50 val FROM dual UNION ALL
            SELECT 4 ID, 60 val FROM dual UNION ALL
            SELECT 6 ID, 70 val FROM dual)
SELECT ID,
       val,
       COUNT(id_distinct) OVER (ORDER BY ID RANGE 3 PRECEDING) cnt_disinct_ids
FROM   (SELECT ID,
               val,
               CASE WHEN row_number() OVER (PARTITION BY ID ORDER BY val) = 1 THEN ID END id_distinct
        FROM   sd);

        ID        VAL CNT_DISINCT_IDS
---------- ---------- ---------------
         1         10               1
         1         20               1
         2         30               2
         2         40               2
         4         50               3
         4         60               3
         6         70               2

ETA: proof that the above technique works for your data:

WITH your_table AS (SELECT 'ABCDE' account_sk, 23 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 23 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 24 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 25 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 53 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 53 day_sk FROM dual UNION ALL
                    SELECT 'ABCDE' account_sk, 55 day_sk FROM dual UNION ALL
                    SELECT 'VWXYZ' account_sk, 10 day_sk FROM dual UNION ALL
                    SELECT 'VWXYZ' account_sk, 12 day_sk FROM dual UNION ALL
                    SELECT 'VWXYZ' account_sk, 40 day_sk FROM dual UNION ALL
                    SELECT 'VWXYZ' account_sk, 40 day_sk FROM dual)
SELECT account_sk,
       day_sk,
       COUNT(day_sk_distinct) OVER (PARTITION BY account_sk ORDER BY day_sk RANGE BETWEEN 29 PRECEDING AND CURRENT ROW) count_distinct_day_sk
FROM   (SELECT account_sk,
               day_sk,
               CASE WHEN row_number() OVER (PARTITION BY account_sk, day_sk ORDER BY day_sk) = 1 THEN day_sk END day_sk_distinct
        FROM   your_table);

ACCOUNT_SK     DAY_SK COUNT_DISTINCT_DAY_SK
---------- ---------- ---------------------
ABCDE              23                     1
ABCDE              23                     1
ABCDE              24                     2
ABCDE              25                     3
ABCDE              53                     3
ABCDE              53                     3
ABCDE              55                     2
VWXYZ              10                     1
VWXYZ              12                     2
VWXYZ              40                     2
VWXYZ              40                     2

The count(distinct ...) works fine with the over clause, the main problem it's the order by. You can't do count (distinct ..) over (partition by ... order by ...) because DISTINCT functions and RATIO_TO_REPORT cannot have an ORDER BY. So i did this:

select tr.*, count (distinct day_number) over (partition by account)
from (select t.*, row_number() over (partition by account order by day_number) row_number from table t) tr
where row_number < 30;

I've tested it in employees scheme of HR (Free oracle scheme that you can find everywhere) I'm not sure it will works in your schema because i have not a replica of it, but, if not, it should give you some ideas:

 select count (distinct manager_id) over (partition by department_id), department_id, manager_id
from (select e.*, row_number() over (partition by department_id order by employee_id) row_number from employees e)
where row_number < 30;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM