How to compute the aggregate function COUNT(DISTINCT) over the values preceding a given value?

Question

I have employee records on Google BigQuery containing: employee_identifier, manager_identifier and date_of_the_record.

My goal is to calculate, with an SQL query, for each record, the number of managers an employee has had up to the date of that record.

I have tried different clauses: OVER (PARTITION BY / ROWS UNBOUNDED PRECEDING), etc.

What I have tried is:

SELECT 
  employee_identifier, 
  date_of_the_record,
  COUNT(DISTINCT manager_identifier) 
    OVER (PARTITION BY employee_identifier ORDER BY date_of_the_record ROWS UNBOUNDED PRECEDING) AS number_of_managers_until_date_of_the_record
FROM employee_database

but DISTINCT is not allowed together with ORDER BY in the OVER clause.

To sum up, I just want the number of distinct managers an employee has had up to the date of each record.

Answer 1

You can achieve this with a correlated sub-query. The following query should do what you want. Note that the sample data below is set up in SQL Server syntax (a #emp temporary table and getdate()).

CREATE TABLE #emp (employee_identifier INT,date_of_the_record DATE,manager_identifier INT)

INSERT INTO #emp VALUES
(1,getdate()-90,10),
(1,getdate()-80,20),
(1,getdate()-70,30),
(1,getdate()-60,10),
(1,getdate()-30,40),
(1,getdate()-20,80)

SELECT
  employee_identifier,
  date_of_the_record,
  -- distinct managers on this employee's records up to and including this date
  (SELECT COUNT(DISTINCT e.manager_identifier)
   FROM #emp e
   WHERE e.employee_identifier = emp.employee_identifier
     AND e.date_of_the_record <= emp.date_of_the_record) AS number_of_managers_until_date_of_the_record
FROM #emp emp
GROUP BY employee_identifier,
  date_of_the_record

The result is as follows:

employee_identifier date_of_the_record  number_of_managers_until_date_of_the_record
1                   2019-04-03          1
1                   2019-04-13          2
1                   2019-04-23          3
1                   2019-05-03          3
1                   2019-06-02          4
1                   2019-06-12          5
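
To check the correlated sub-query locally, here is a minimal sketch that runs the same logic through Python's built-in sqlite3 module. SQLite is only a stand-in here (it is neither BigQuery nor SQL Server), the table name emp and the helper running_manager_counts are made up for the illustration, and the dates are written out explicitly to match the result table above.

```python
import sqlite3

# Same sample rows as in the answer, with explicit ISO dates (assumption:
# these are the dates shown in the result table above).
ROWS = [
    (1, "2019-04-03", 10),
    (1, "2019-04-13", 20),
    (1, "2019-04-23", 30),
    (1, "2019-05-03", 10),
    (1, "2019-06-02", 40),
    (1, "2019-06-12", 80),
]

# Correlated sub-query: for each record, count the distinct managers on all
# records of the same employee dated on or before that record.
QUERY = """
SELECT
  cur.employee_identifier,
  cur.date_of_the_record,
  (SELECT COUNT(DISTINCT prev.manager_identifier)
   FROM emp AS prev
   WHERE prev.employee_identifier = cur.employee_identifier
     AND prev.date_of_the_record <= cur.date_of_the_record) AS number_of_managers
FROM emp AS cur
ORDER BY cur.employee_identifier, cur.date_of_the_record
"""


def running_manager_counts(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp ("
        "employee_identifier INTEGER, "
        "date_of_the_record TEXT, "
        "manager_identifier INTEGER)"
    )
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


counts = running_manager_counts(ROWS)
for row in counts:
    print(row)
```

ISO-formatted date strings sort in chronological order, which is why plain TEXT comparison is enough in this sketch. The fourth record repeats manager 10, so its count stays at 3.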

Answer 2

Below is for BigQuery Standard SQL:

#standardSQL
SELECT * EXCEPT(arr),
  (SELECT COUNT(DISTINCT id) FROM UNNEST(arr) id) AS number_of_managers_until_date_of_the_record
FROM (
  SELECT *, ARRAY_AGG(manager_identifier) OVER(win) arr
  FROM `project.dataset.employee_database`
  WINDOW win AS (PARTITION BY employee_identifier ORDER BY date_of_the_record)
)
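
The idea behind this query is that the windowed ARRAY_AGG collects, for each row, the managers from every row of the same employee up to that date, and the sub-query over UNNEST(arr) then counts the distinct values in that array. As a rough illustration only, the following plain-Python sketch mimics those two steps. The function name and sample data are hypothetical, and it assumes the default window frame, which also includes rows sharing the current row's date.

```python
from collections import defaultdict


def managers_until_each_record(records):
    """records: iterable of (employee_identifier, date_of_the_record, manager_identifier).

    Returns (employee, date, manager, distinct_manager_count) tuples.
    """
    # PARTITION BY employee_identifier
    by_employee = defaultdict(list)
    for employee, date, manager in records:
        by_employee[employee].append((date, manager))

    output = []
    for employee, rows in by_employee.items():
        rows.sort()  # ORDER BY date_of_the_record
        for date, manager in rows:
            # Step 1 (ARRAY_AGG ... OVER win): managers on all rows up to this date.
            arr = [m for d, m in rows if d <= date]
            # Step 2 (COUNT(DISTINCT id) FROM UNNEST(arr)): count the distinct values.
            output.append((employee, date, manager, len(set(arr))))
    return output


SAMPLE = [
    (1, "2019-04-03", 10),
    (1, "2019-04-13", 20),
    (1, "2019-04-23", 30),
    (1, "2019-05-03", 10),
    (1, "2019-06-02", 40),
    (1, "2019-06-12", 80),
]

result = managers_until_each_record(SAMPLE)
for row in result:
    print(row)
```

The intermediate array grows with every row of an employee, so this approach trades memory for the ability to use DISTINCT in an ordered window.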


Answer 1: score 2, accepted, 2019-07-02 04:35:37

Answer 2: score 0, 2019-07-02 04:45:34
