How to calculate aggregate function COUNT(DISTINCT ) over values previous to one value?

I have employee records on Google BigQuery containing: employee_identifier, manager_identifier and date_of_the_record.

My goal is to calculate, through an SQL query, for each record, the number of managers an employee had up to the date of the record.

I have tried different clauses: OVER (PARTITION BY / ROWS UNBOUNDED PRECEDING), etc.

What I have tried is:

SELECT 
  employee_identifier, 
  date_of_the_record,
  COUNT(DISTINCT manager_identifier) 
    OVER (PARTITION BY employee_identifier ORDER BY date_of_the_record ROWS UNBOUNDED PRECEDING) AS number_of_managers_until_date_of_the_record
FROM employee_database

but DISTINCT is not allowed in a window function that also has an ORDER BY clause.

To sum up, I just want the number of distinct managers an employee had up to the date of the record.

You could achieve this using a correlated sub-query. The following query should do what you want (the sample data is set up with SQL Server syntax: a #emp temp table and getdate()):

CREATE TABLE #emp (employee_identifier INT,date_of_the_record DATE,manager_identifier INT)

INSERT INTO #emp VALUES
(1,getdate()-90,10),
(1,getdate()-80,20),
(1,getdate()-70,30),
(1,getdate()-60,10),
(1,getdate()-30,40),
(1,getdate()-20,80)

-- For each record, count the distinct managers on records dated on or before it.
SELECT 
  emp.employee_identifier, 
  emp.date_of_the_record,
  (SELECT COUNT(DISTINCT e.manager_identifier)
   FROM #emp e
   WHERE e.employee_identifier = emp.employee_identifier
     AND e.date_of_the_record <= emp.date_of_the_record) AS number_of_managers_until_date_of_the_record
FROM #emp emp
GROUP BY emp.employee_identifier, 
  emp.date_of_the_record

The result is as below:

employee_identifier date_of_the_record  number_of_managers_until_date_of_the_record
1                   2019-04-03          1
1                   2019-04-13          2
1                   2019-04-23          3
1                   2019-05-03          3
1                   2019-06-02          4
1                   2019-06-12          5
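
For anyone who wants to try the correlated sub-query without a SQL Server or BigQuery instance, here is a minimal sketch that runs the same logic in an in-memory SQLite database from Python. The table name emp, the aliases cur and prev, and the helper managers_until_date are assumptions made for this illustration; the rows reuse the dates and manager ids from the result above. SQLite is only a stand-in here, so treat this as a way to check the logic rather than as BigQuery code.

```python
import sqlite3

# Sample rows taken from the answer's result table:
# (employee_identifier, date_of_the_record, manager_identifier).
ROWS = [
    (1, "2019-04-03", 10),
    (1, "2019-04-13", 20),
    (1, "2019-04-23", 30),
    (1, "2019-05-03", 10),
    (1, "2019-06-02", 40),
    (1, "2019-06-12", 80),
]

# Correlated sub-query: for each record (cur), count the distinct managers
# on records (prev) of the same employee dated on or before it.
# Dates are stored as ISO-8601 text, so string comparison orders them correctly.
QUERY = """
SELECT
  cur.employee_identifier,
  cur.date_of_the_record,
  (SELECT COUNT(DISTINCT prev.manager_identifier)
     FROM emp AS prev
    WHERE prev.employee_identifier = cur.employee_identifier
      AND prev.date_of_the_record <= cur.date_of_the_record
  ) AS number_of_managers_until_date_of_the_record
FROM emp AS cur
ORDER BY cur.employee_identifier, cur.date_of_the_record
"""


def managers_until_date(rows):
    """Load rows into an in-memory table and run the correlated sub-query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE emp ("
            "employee_identifier INTEGER, "
            "date_of_the_record TEXT, "
            "manager_identifier INTEGER)"
        )
        conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in managers_until_date(ROWS):
        print(row)
```

The fourth record keeps the count at 3 because manager 10 had already appeared on the first record.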

Below is for BigQuery Standard SQL:

#standardSQL
SELECT * EXCEPT(arr),
  (SELECT COUNT(DISTINCT id) FROM UNNEST(arr) id) AS number_of_managers_until_date_of_the_record
FROM (
  SELECT *, ARRAY_AGG(manager_identifier) OVER(win) arr
  FROM `project.dataset.employee_database`
  WINDOW win AS (PARTITION BY employee_identifier ORDER BY date_of_the_record)
)
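
To make the logic of the query above easier to follow, here is a rough Python emulation of its two steps: collect the manager ids seen so far for each employee (the ARRAY_AGG ... OVER part), then count the distinct values in that collection (the COUNT(DISTINCT id) FROM UNNEST(arr) part). The function name running_distinct_managers is hypothetical and this is not BigQuery code. It assumes that records sharing the same date for one employee all see each other's managers, which is how I understand the default window frame when ORDER BY is given without an explicit frame.

```python
from collections import defaultdict


def running_distinct_managers(records):
    """Emulate the ARRAY_AGG-then-COUNT(DISTINCT) approach in plain Python.

    records: iterable of
        (employee_identifier, date_of_the_record, manager_identifier).
    Returns a list of (employee_identifier, date_of_the_record, count).
    """
    # PARTITION BY employee_identifier
    by_employee = defaultdict(list)
    for employee, date, manager in records:
        by_employee[employee].append((date, manager))

    result = []
    for employee in sorted(by_employee):
        # ORDER BY date_of_the_record
        history = sorted(by_employee[employee], key=lambda rec: rec[0])
        for date, _ in history:
            # Window contents: every manager on a record dated on or before
            # the current one (records with the same date are included).
            arr = [m for d, m in history if d <= date]
            # COUNT(DISTINCT ...) ignores NULLs, so skip None here as well.
            distinct = {m for m in arr if m is not None}
            result.append((employee, date, len(distinct)))
    return result


if __name__ == "__main__":
    sample = [
        (1, "2019-04-03", 10),
        (1, "2019-04-13", 20),
        (1, "2019-04-23", 30),
        (1, "2019-05-03", 10),
        (1, "2019-06-02", 40),
        (1, "2019-06-12", 80),
    ]
    for row in running_distinct_managers(sample):
        print(row)
```

The emulation rebuilds the list for every record, so it is quadratic per employee; it is meant only to show what each row of the query sees, not to replace the query.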
