Getting count of distinct column with window functions in MySQL 8

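I have an MVP DB fiddle: https://www.db-fiddle.com/f/cUn1Lo2xhbTAUwwV5q9wKV/2

I'm trying to use window functions to get the number of unique shift_id values in the table for any given date.

I tried COUNT(DISTINCT(shift_id)), but MySQL 8 does not currently support DISTINCT inside a window function.

In case the fiddle goes offline, here is the test schema: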

CREATE TABLE `scores` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `shift_id` int unsigned NOT NULL,
  `employee_name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `score` double(8,2) unsigned NOT NULL,
  `created_at` timestamp NOT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO scores(shift_id, employee_name, score, created_at) VALUES
(1, "John", 6.72, "2020-04-01 00:00:00"),
(1, "Bob", 15.71, "2020-04-01 00:00:00"),
(1, "Bob", 54.02, "2020-04-01 00:00:00"),
(1, "John", 23.55, "2020-04-01 00:00:00"),

(2, "John", 9.13, "2020-04-02 00:00:00"),
(2, "Bob", 44.76, "2020-04-02 00:00:00"),
(2, "Bob", 33.40, "2020-04-02 00:00:00"),
(2, "James", 20, "2020-04-02 00:00:00"),

(3, "John", 20, "2020-04-02 00:00:00"),
(3, "Bob", 20, "2020-04-02 08:00:00"),
(3, "Bob", 30, "2020-04-02 08:00:00"),
(3, "James", 10, "2020-04-02 08:00:00");

My query makes two attempts, based on what I read in this post: Count distinct in window functions

SELECT
    ANY_VALUE(employee_name) AS `employee_name`,
    DATE(created_at) AS `shift_date`,
    COUNT(*) OVER (PARTITION BY ANY_VALUE(created_at), ANY_VALUE(shift_id)) AS `shifts_on_day_1`,

    (
        dense_rank() over (partition by ANY_VALUE(created_at) order by ANY_VALUE(shift_id) asc) +
        dense_rank() over (partition by ANY_VALUE(created_at) order by ANY_VALUE(shift_id) desc) - 1
    ) as `shifts_on_day_2`

FROM scores
    GROUP BY employee_name, DATE(created_at);

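The expected result is shifts_on_day = 1 for any row dated 2020-04-01 and shifts_on_day = 2 for rows dated 2020-04-02.

I considered a correlated subquery, but that is a performance nightmare: the table has millions of rows and the query returns thousands.

Update: I think window functions are needed because the query already has a GROUP BY. All the data is needed in one query, and the end goal is each employee's average score for a given date. To get the total for each employee I can just use COUNT(*), but I then need to divide that by the total number of shifts on the day to get the average.

Update

The end result should be the total number of rows per employee per date, divided by the total number of shifts that took place on that date. This gives the average number of rows per employee on that date.

The expected result is therefore: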

name  | shift_date | avrg
------+------------+-----
Bob   | 2020-04-01 | 2     2 / 1 = 2 ; two rows for Bob, one shift_id (1) that day
Bob   | 2020-04-02 | 2     4 / 2 = 2 ; four rows for Bob, two shift_ids (2,3) that day
James | 2020-04-02 | 1     2 / 2 = 1 ; two rows for James, two shift_ids (2,3) that day
John  | 2020-04-01 | 2     2 / 1 = 2 ; two rows for John, one shift_id (1) that day
John  | 2020-04-02 | 1     2 / 2 = 1 ; two rows for John, two shift_ids (2,3) that day

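"All rows per date and employee" and "count of distinct IDs per date" are two entirely different aggregations. You cannot perform one aggregation and then somehow recover the other from the already-aggregated rows, which rules out applying a window function to the aggregation result.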
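To see this on the sample data, here is a minimal sketch that can be run locally. It rests on assumptions not in the original post: Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for MySQL 8, the table is simplified to SQLite types, and the names window_over_groups and groups_on_day are made up for the illustration. A window function in a grouped query is evaluated over the grouped rows, so partitioning by date counts employee groups per day rather than shifts.

```python
import sqlite3

# Sample rows from the question: (shift_id, employee_name, score, created_at).
ROWS = [
    (1, "John", 6.72, "2020-04-01 00:00:00"),
    (1, "Bob", 15.71, "2020-04-01 00:00:00"),
    (1, "Bob", 54.02, "2020-04-01 00:00:00"),
    (1, "John", 23.55, "2020-04-01 00:00:00"),
    (2, "John", 9.13, "2020-04-02 00:00:00"),
    (2, "Bob", 44.76, "2020-04-02 00:00:00"),
    (2, "Bob", 33.40, "2020-04-02 00:00:00"),
    (2, "James", 20, "2020-04-02 00:00:00"),
    (3, "John", 20, "2020-04-02 00:00:00"),
    (3, "Bob", 20, "2020-04-02 08:00:00"),
    (3, "Bob", 30, "2020-04-02 08:00:00"),
    (3, "James", 10, "2020-04-02 08:00:00"),
]


def window_over_groups():
    """Run a window function on top of GROUP BY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Simplified SQLite version of the question's `scores` table.
    conn.execute(
        "CREATE TABLE scores ("
        " id INTEGER PRIMARY KEY,"
        " shift_id INTEGER NOT NULL,"
        " employee_name TEXT NOT NULL,"
        " score REAL NOT NULL,"
        " created_at TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO scores (shift_id, employee_name, score, created_at)"
        " VALUES (?, ?, ?, ?)",
        ROWS,
    )
    # The window sees one row per (employee, date) group, not the raw rows,
    # so COUNT(*) OVER (...) counts groups on each date.
    query = """
        SELECT employee_name,
               DATE(created_at) AS shift_date,
               COUNT(*) AS total_rows,
               COUNT(*) OVER (PARTITION BY DATE(created_at)) AS groups_on_day
        FROM scores
        GROUP BY employee_name, DATE(created_at)
        ORDER BY employee_name, shift_date
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in window_over_groups():
        print(row)
```

On the sample data the last column is 2 for 2020-04-01 and 3 for 2020-04-02, which is the number of employees with rows on that day, not the 1 and 2 distinct shifts the question asks for.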

You need two separate aggregations. For example:

with empdays as
(
  select employee_name, date(created_at) as shift_date, count(*) as total
  from scores
  group by employee_name, date(created_at)
)
, days as 
(
  select date(created_at) as shift_date, count(distinct shift_id) as total
  from scores
  group by date(created_at)
)
select ed.employee_name, shift_date, ed.total / d.total as average
from empdays ed
join days d using (shift_date)
order by ed.employee_name, shift_date;

Demo: https://www.db-fiddle.com/f/qjqbibriXtos6Hsi5qcwi6/0
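If the fiddle is unavailable, the two-aggregation query can also be checked locally with a small sketch like the one below. This is an assumption-based stand-in, not the original demo: it uses Python's built-in sqlite3 module instead of MySQL 8, a simplified table definition, and a hypothetical helper name, average_rows_per_shift. One deliberate difference: SQLite performs integer division on two integers, whereas MySQL's `/` returns a decimal, so the sketch multiplies by 1.0 to get a non-integer result type.

```python
import sqlite3

# Sample rows from the question: (shift_id, employee_name, score, created_at).
ROWS = [
    (1, "John", 6.72, "2020-04-01 00:00:00"),
    (1, "Bob", 15.71, "2020-04-01 00:00:00"),
    (1, "Bob", 54.02, "2020-04-01 00:00:00"),
    (1, "John", 23.55, "2020-04-01 00:00:00"),
    (2, "John", 9.13, "2020-04-02 00:00:00"),
    (2, "Bob", 44.76, "2020-04-02 00:00:00"),
    (2, "Bob", 33.40, "2020-04-02 00:00:00"),
    (2, "James", 20, "2020-04-02 00:00:00"),
    (3, "John", 20, "2020-04-02 00:00:00"),
    (3, "Bob", 20, "2020-04-02 08:00:00"),
    (3, "Bob", 30, "2020-04-02 08:00:00"),
    (3, "James", 10, "2020-04-02 08:00:00"),
]

# Same shape as the query above: one CTE per aggregation, joined on the date.
QUERY = """
    WITH empdays AS (
        SELECT employee_name, DATE(created_at) AS shift_date, COUNT(*) AS total
        FROM scores
        GROUP BY employee_name, DATE(created_at)
    ),
    days AS (
        SELECT DATE(created_at) AS shift_date,
               COUNT(DISTINCT shift_id) AS total
        FROM scores
        GROUP BY DATE(created_at)
    )
    SELECT ed.employee_name, shift_date, 1.0 * ed.total / d.total AS average
    FROM empdays ed
    JOIN days d USING (shift_date)
    ORDER BY ed.employee_name, shift_date
"""


def average_rows_per_shift():
    """Load the sample data and run the two-aggregation query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Simplified SQLite version of the question's `scores` table.
    conn.execute(
        "CREATE TABLE scores ("
        " id INTEGER PRIMARY KEY,"
        " shift_id INTEGER NOT NULL,"
        " employee_name TEXT NOT NULL,"
        " score REAL NOT NULL,"
        " created_at TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO scores (shift_id, employee_name, score, created_at)"
        " VALUES (?, ?, ?, ?)",
        ROWS,
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, shift_date, average in average_rows_per_shift():
        print(name, shift_date, average)
```

On the sample rows this reproduces the expected table from the question: 2 for both employees on 2020-04-01, and 2, 1, 1 for Bob, James and John on 2020-04-02.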
