SQL DISTINCT EXISTS GROUP BY aggregate function

Does a relational database exist that has a GROUP BY aggregate function, such as DISTINCT EXISTS, that returns TRUE if there is more than one distinct value in the group and FALSE otherwise? I am looking for something that would iterate through the values in the group only until the current value differs from the previous value, instead of counting ALL of the distinct values.

Example:
pv_name | time_stamp | value
A       | 1          | 1
B       | 2          | 1
C       | 3          | 1
A       | 4          | 2
C       | 5          | 2
B       | 6          | 3

SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING DISTINCT_EXISTS(value);

Result: A, C

SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING MIN(value)<>MAX(value);

This might get you there quicker, depending on indexes. I don't think you'll do much better than this or COUNT(DISTINCT value), though.
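To make the comparison concrete, here is a minimal sketch that runs both HAVING conditions against the sample rows from the question. It assumes SQLite through Python's standard sqlite3 module purely for convenience (the question does not name an engine), and the helper name groups_with_changes is made up for this illustration. It only checks that the two conditions select the same groups; it says nothing about relative speed.

```python
import sqlite3

# Sample rows from the question: (pv_name, time_stamp, value).
ROWS = [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
        ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)]


def groups_with_changes(having_clause):
    """Return the pv_names kept by the given HAVING condition (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO example VALUES (?, ?, ?)", ROWS)
    sql = (
        "SELECT pv_name FROM example "
        "WHERE time_stamp > 0 AND time_stamp < 6 "
        "GROUP BY pv_name "
        "HAVING " + having_clause + " "
        "ORDER BY pv_name"
    )
    names = [row[0] for row in conn.execute(sql)]
    conn.close()
    return names


# Both conditions should pick the groups that hold more than one distinct value.
min_max = groups_with_changes("MIN(value) <> MAX(value)")
count_distinct = groups_with_changes("COUNT(DISTINCT value) > 1")
print(min_max, count_distinct)
```

Note that both forms ignore NULLs in value, so a group containing only one non-NULL value plus NULLs is not reported by either.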

Have you tried joining to example twice? Pseudo-code example:

WITH Q AS
(
    -- value must be selected here because the join condition compares it
    SELECT pv_name, value
    FROM example
    WHERE time_stamp > 0 AND time_stamp < 6
)
SELECT DISTINCT Q1.pv_name
FROM Q AS Q1 INNER JOIN Q AS Q2 ON
Q1.pv_name = Q2.pv_name AND
Q1.value <> Q2.value
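As a quick check of the self-join idea, the sketch below runs it against the question's sample rows. SQLite via Python's sqlite3 module is an assumption made only so the example is runnable; the CTE here also selects value, since the join condition needs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
     ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)],
)

# A name qualifies as soon as two of its rows in range carry different values.
self_join_sql = """
WITH Q AS (
    SELECT pv_name, value
    FROM example
    WHERE time_stamp > 0 AND time_stamp < 6
)
SELECT DISTINCT Q1.pv_name
FROM Q AS Q1
INNER JOIN Q AS Q2
    ON Q1.pv_name = Q2.pv_name
   AND Q1.value <> Q2.value
ORDER BY Q1.pv_name
"""
self_join_names = [row[0] for row in conn.execute(self_join_sql)]
conn.close()
print(self_join_names)
```

The join can produce many row pairs per name before DISTINCT collapses them, so on large groups it may well do more work than the MIN/MAX form.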

You probably know about the COUNT(DISTINCT) function and you want to avoid it to prevent unnecessary computations.

It is hard to know why you are looking for this, but I assume that it takes a long time to find these groups using the most obvious query:

SELECT type, COUNT(DISTINCT product)
FROM aTable
GROUP BY type
HAVING COUNT(DISTINCT product) > 1

I recommend trying window functions, for example T-SQL's newer LAST_VALUE and FIRST_VALUE functions:

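WITH c AS (
SELECT type
 -- Explicit frame: with ORDER BY, the default frame ends at the current row,
 -- so LAST_VALUE would otherwise return the current row's own product.
 ,LAST_VALUE(product) OVER (PARTITION BY type ORDER BY product
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lv
 ,FIRST_VALUE(product) OVER (PARTITION BY type ORDER BY product) pv
FROM aTable
)
SELECT DISTINCT type FROM c WHERE lv <> pv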

If the DB engine is smart enough, it will find the first and last value for the group without counting all the values, and will therefore perform better.
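The FIRST_VALUE/LAST_VALUE approach can be tried out with the sketch below. Several things here are assumptions: it uses SQLite (3.25 or later, which added window functions) through Python's sqlite3 module rather than SQL Server, and the rows in aTable are invented for illustration. The frame is spelled out explicitly because the default frame with ORDER BY stops at the current row, which would make LAST_VALUE return the current row's own product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aTable (type TEXT, product TEXT)")
# Made-up rows: 'fruit' has two distinct products, 'tool' has only one.
conn.executemany(
    "INSERT INTO aTable VALUES (?, ?)",
    [("fruit", "apple"), ("fruit", "pear"), ("tool", "saw"), ("tool", "saw")],
)

# Compare the first and last product of each type over the whole partition.
window_sql = """
WITH c AS (
    SELECT type,
           FIRST_VALUE(product) OVER (
               PARTITION BY type ORDER BY product
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS pv,
           LAST_VALUE(product) OVER (
               PARTITION BY type ORDER BY product
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS lv
    FROM aTable
)
SELECT DISTINCT type FROM c WHERE lv <> pv ORDER BY type
"""
mixed_types = [row[0] for row in conn.execute(window_sql)]
conn.close()
print(mixed_types)
```

Whether this beats COUNT(DISTINCT) depends entirely on the engine and the available indexes; the sketch only shows that the result is the same set of groups.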

For MySQL you can use helper variables to get the row_number per group based on the distinct values, something like this:

SELECT type, product
FROM (
-- RowNumber restarts at 1 for each type and increases only when product changes
SELECT  @row_num := IF(@prev_type=type, IF(@prev_prod=product,@row_num,@row_num+1), 1) AS RowNumber
       ,type
       ,product
       ,@prev_type := type
       ,@prev_prod := product
  FROM Person,
      (SELECT @row_num := 1) x,
      (SELECT @prev_type := '') y,
      (SELECT @prev_prod := '') z
  ORDER BY type, product
) as a
WHERE RowNumber > 1

I think HAVING MIN(value) <> MAX(value) will be the most efficient here. An alternative is:

 Select distinct e.pv_name
 From example e
 Left join (
     -- names whose rows in range all share a single value
     Select pv_name
     From example
     Where ...
     Group by pv_name
     Having min(value) = max(value)
     ) s on e.pv_name = s.pv_name
 Where s.pv_name is null

Or you could use NOT EXISTS against that subquery instead.

Include the relevant WHERE clause in the subquery.
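A related variant, not the exact query above, is a correlated EXISTS that asks whether any other row of the same name in range has a different value. In principle an engine can stop probing at the first such row, which is close to the early-exit behaviour the question asks for, though whether it actually does so is up to the planner. The sketch assumes SQLite through Python's sqlite3 module and reuses the question's sample rows and time_stamp filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
     ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)],
)

# The time_stamp filter is repeated inside the subquery so that rows
# outside the range cannot make a group look mixed.
exists_sql = """
SELECT DISTINCT e.pv_name
FROM example e
WHERE e.time_stamp > 0 AND e.time_stamp < 6
  AND EXISTS (
      SELECT 1
      FROM example o
      WHERE o.pv_name = e.pv_name
        AND o.time_stamp > 0 AND o.time_stamp < 6
        AND o.value <> e.value
  )
ORDER BY e.pv_name
"""
exists_names = [row[0] for row in conn.execute(exists_sql)]
conn.close()
print(exists_names)
```

An index on (pv_name, value) would likely help the inner probe, but that is an untested guess for any particular engine.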
