Count distinct on TWO columns in SQL
Let's consider this example:
Employee  Function  Start_dept  End_dept
A         dev       10          13
A         dev       11          12
A         test      9           9
A         dev       13          11
What I want to select is each employee, their function, and the number of distinct departments across BOTH the "start" and "end" department columns. It should give this result:
Employee  Function  count_distinct_dept
A         dev       4
A         test      1
For employee A as dev, there are only 4 distinct departments (10, 11, 12 and 13), because a value that appears in both columns (start and end) should be counted only once.
How can I do this? (I'm using MySQL.) Is it possible to do this in one query without any JOIN or UNION, or is it mandatory to use one of them? Since I am working with a huge database (more than 3 billion rows), I am not sure a join or union query will be optimal...
Use a union all and aggregation:
select Employee, Function, count(distinct dept)
from ((select Employee, Function, Start_dept as dept
       from e
      ) union all
      (select Employee, Function, End_dept
       from e
      )
     ) e
group by Employee, Function;
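To check the query above against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name e comes from the answer; the function name count_distinct_depts, the derived-table alias u, and the ORDER BY are assumptions added for illustration. SQLite does not accept parentheses around the individual SELECTs of a UNION ALL, so they are dropped here, and the Function column is double-quoted only as a precaution against keyword clashes.

```python
import sqlite3

# Sample rows from the question: (Employee, Function, Start_dept, End_dept)
SAMPLE_ROWS = [
    ("A", "dev", 10, 13),
    ("A", "dev", 11, 12),
    ("A", "test", 9, 9),
    ("A", "dev", 13, 11),
]

QUERY = """
    SELECT Employee, "Function", COUNT(DISTINCT dept) AS count_distinct_dept
    FROM (
        SELECT Employee, "Function", Start_dept AS dept FROM e
        UNION ALL
        SELECT Employee, "Function", End_dept FROM e
    ) AS u
    GROUP BY Employee, "Function"
    ORDER BY Employee, "Function"
"""


def count_distinct_depts(rows):
    """Load rows into an in-memory table e and run the UNION ALL query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE e (Employee TEXT, "Function" TEXT, '
            "Start_dept INTEGER, End_dept INTEGER)"
        )
        conn.executemany("INSERT INTO e VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in count_distinct_depts(SAMPLE_ROWS):
        print(row)
```

On the sample rows this should return 4 for (A, dev) and 1 for (A, test), matching the expected result in the question. It says nothing about performance on billions of rows, which depends on the MySQL execution plan.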
If you want performance, I would suggest starting with two indexes, on (Employee, Function, Start_Dept) and (Employee, Function, End_Dept). Then:
select Employee, Function, count(distinct dept)
from ((select distinct Employee, Function, Start_dept as dept
       from e
      ) union all
      (select distinct Employee, Function, End_dept
       from e
      )
     ) e
group by Employee, Function;
The subqueries should be scanning the index rather than the overall table. You will still need to do the final GROUP BY. I am guessing that COUNT(DISTINCT) is a better approach than UNION in the subquery, but you could test that.
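The two variants mentioned above can be compared for correctness with a small sketch, again using Python's sqlite3 as a stand-in for MySQL. The index names idx_e_start and idx_e_end, the alias u, and the function compare_variants are hypothetical names chosen for this example. The sketch only confirms that both forms return the same counts; which one is faster on a large MySQL table is something to measure there, for instance by inspecting the plan with EXPLAIN.

```python
import sqlite3

SETUP = """
CREATE TABLE e (Employee TEXT, "Function" TEXT,
                Start_dept INTEGER, End_dept INTEGER);
-- The two composite indexes suggested in the answer (names are made up).
CREATE INDEX idx_e_start ON e (Employee, "Function", Start_dept);
CREATE INDEX idx_e_end   ON e (Employee, "Function", End_dept);
"""

# Variant 1: DISTINCT subqueries, UNION ALL, then COUNT(DISTINCT).
COUNT_DISTINCT_QUERY = """
    SELECT Employee, "Function", COUNT(DISTINCT dept)
    FROM (
        SELECT DISTINCT Employee, "Function", Start_dept AS dept FROM e
        UNION ALL
        SELECT DISTINCT Employee, "Function", End_dept FROM e
    ) AS u
    GROUP BY Employee, "Function"
    ORDER BY Employee, "Function"
"""

# Variant 2: UNION removes duplicate rows, so a plain COUNT is enough.
UNION_QUERY = """
    SELECT Employee, "Function", COUNT(dept)
    FROM (
        SELECT Employee, "Function", Start_dept AS dept FROM e
        UNION
        SELECT Employee, "Function", End_dept FROM e
    ) AS u
    GROUP BY Employee, "Function"
    ORDER BY Employee, "Function"
"""


def compare_variants(rows):
    """Return the results of both query variants on the given rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        conn.executemany("INSERT INTO e VALUES (?, ?, ?, ?)", rows)
        with_count_distinct = conn.execute(COUNT_DISTINCT_QUERY).fetchall()
        with_union = conn.execute(UNION_QUERY).fetchall()
        return with_count_distinct, with_union
    finally:
        conn.close()


if __name__ == "__main__":
    rows = [("A", "dev", 10, 13), ("A", "dev", 11, 12),
            ("A", "test", 9, 9), ("A", "dev", 13, 11)]
    a, b = compare_variants(rows)
    print(a == b, a)
```

Both variants ignore NULL departments, since COUNT over a column skips NULL values with or without DISTINCT.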