How to calculate the aggregate function COUNT(DISTINCT) over the values preceding a given value?
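I have employee records in Google BigQuery with the columns employee_identifier, manager_identifier and date_of_the_record.
My goal is to calculate, for each record and with a SQL query, the number of managers the employee had before the date of the record.
I tried different clauses: OVER (PARTITION BY / ROWS UNBOUNDED PRECEDING) and so on.
This is what I tried: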
SELECT
employee_identifier,
date_of_the_record,
COUNT(DISTINCT manager_identifier)
OVER (PARTITION BY employee_identifier ORDER BY date_of_the_record ROWS UNBOUNDED PRECEDING) AS number_of_managers_until_date_of_the_record
FROM employee_database
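But ORDER BY is not allowed in the window specification when the DISTINCT clause is used.
In summary, I just want the number of (distinct) managers the employee had up to the date of the record.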
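You can use a correlated sub-query to achieve this. The following query should do what you want (the sample setup is written in SQL Server syntax, with a temporary table #emp and getdate()):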
CREATE TABLE #emp (employee_identifier INT,date_of_the_record DATE,manager_identifier INT)
INSERT INTO #emp VALUES
(1,getdate()-90,10),
(1,getdate()-80,20),
(1,getdate()-70,30),
(1,getdate()-60,10),
(1,getdate()-30,40),
(1,getdate()-20,80)
SELECT
employee_identifier,
date_of_the_record,
(SELECT COUNT(DISTINCT (manager_identifier)) FROM #emp e WHERE e.employee_identifier = emp.employee_identifier AND e.date_of_the_record <= emp.date_of_the_record) AS number_of_managers_until_date_of_the_record
FROM #emp emp
GROUP BY employee_identifier,
date_of_the_record
The result is as follows:
employee_identifier  date_of_the_record  number_of_managers_until_date_of_the_record
1                    2019-04-03          1
1                    2019-04-13          2
1                    2019-04-23          3
1                    2019-05-03          3
1                    2019-06-02          4
1                    2019-06-12          5
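
For readers who want to try the correlated sub-query without a SQL Server or BigQuery instance, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The fixed dates are taken from the result table above (the original setup derives them from getdate()); the table name employee_database comes from the question, while the helper name running_manager_counts is only an assumption made for this illustration.

```python
import sqlite3

# Sample rows: (employee_identifier, date_of_the_record, manager_identifier).
# The dates are fixed ISO strings copied from the result table above,
# so plain string comparison orders them chronologically.
ROWS = [
    (1, "2019-04-03", 10),
    (1, "2019-04-13", 20),
    (1, "2019-04-23", 30),
    (1, "2019-05-03", 10),
    (1, "2019-06-02", 40),
    (1, "2019-06-12", 80),
]

# Same shape as the correlated sub-query in the answer: for each outer row,
# count the distinct managers over all rows of the same employee whose date
# is on or before the outer row's date.
QUERY = """
SELECT
  emp.employee_identifier,
  emp.date_of_the_record,
  (SELECT COUNT(DISTINCT e.manager_identifier)
     FROM employee_database AS e
    WHERE e.employee_identifier = emp.employee_identifier
      AND e.date_of_the_record <= emp.date_of_the_record
  ) AS number_of_managers_until_date_of_the_record
FROM employee_database AS emp
GROUP BY emp.employee_identifier, emp.date_of_the_record
ORDER BY emp.employee_identifier, emp.date_of_the_record
"""


def running_manager_counts(rows):
    """Hypothetical helper: load rows into SQLite and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE employee_database ("
            "employee_identifier INTEGER, "
            "date_of_the_record TEXT, "
            "manager_identifier INTEGER)"
        )
        conn.executemany("INSERT INTO employee_database VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in running_manager_counts(ROWS):
        print(row)
```

Because the sub-query rescans the employee's rows for every outer row, the work grows roughly quadratically with the number of records per employee, which may matter on large tables.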
Below is a solution for BigQuery Standard SQL:
#standardSQL
SELECT * EXCEPT(arr),
(SELECT COUNT(DISTINCT id) FROM UNNEST(arr) id) AS number_of_managers_until_date_of_the_record
FROM (
SELECT *, ARRAY_AGG(manager_identifier) OVER(win) arr
FROM `project.dataset.employee_database`
WINDOW win AS (PARTITION BY employee_identifier ORDER BY date_of_the_record)
)
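
To make the logic of the ARRAY_AGG query easier to follow, here is a rough Python sketch of what it computes: for each row, collect the managers of the same employee up to that row's date, then count the distinct values. It assumes the default window frame that applies when ORDER BY is given without an explicit frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), so rows sharing the same date see each other's managers. The function name distinct_managers_to_date and the sample rows for employee 2 are assumptions made for this illustration, not part of the original answer.

```python
from collections import defaultdict


def distinct_managers_to_date(records):
    """Emulate ARRAY_AGG(...) OVER (PARTITION BY ... ORDER BY ...) + COUNT(DISTINCT).

    records: iterable of (employee_identifier, date_of_the_record, manager_identifier).
    Returns a list of (employee, date, manager, distinct_manager_count) tuples.
    """
    # PARTITION BY employee_identifier
    by_employee = defaultdict(list)
    for employee, date, manager in records:
        by_employee[employee].append((date, manager))

    out = []
    for employee in sorted(by_employee):
        # ORDER BY date_of_the_record
        history = sorted(by_employee[employee])
        for date, manager in history:
            # Window contents: every manager on or before this row's date
            # (rows with the same date are peers and are included).
            arr = [m for d, m in history if d <= date]
            # SELECT COUNT(DISTINCT id) FROM UNNEST(arr) id
            out.append((employee, date, manager, len(set(arr))))
    return out


if __name__ == "__main__":
    sample = [
        (1, "2019-04-03", 10),
        (1, "2019-04-13", 20),
        (1, "2019-04-23", 30),
        (1, "2019-05-03", 10),
        (2, "2019-01-01", 7),   # hypothetical second employee
        (2, "2019-01-01", 8),   # same date: both rows count both managers
    ]
    for row in distinct_managers_to_date(sample):
        print(row)
```

The array built for each row grows with the employee's history, so this approach trades memory per row for the ability to apply DISTINCT inside an ordered window.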