SQL DISTINCT EXISTS GROUP BY aggregate function

Does a relational database exist that has a GROUP BY aggregate function such as DISTINCT EXISTS that returns TRUE if there is more than one distinct value for the group and FALSE otherwise? I am looking for something that would iterate through the values in the group until the current value is not the same as the previous value, instead of counting ALL of the distinct values.

Example:
pv_name | time_stamp | value
A       | 1          | 1
B       | 2          | 1
C       | 3          | 1
A       | 4          | 2
C       | 5          | 2
B       | 6          | 3

SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING DISTINCT_EXISTS(value);

Result: A, C

SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING MIN(value)<>MAX(value);

Might get you there quicker depending on indexes. I don't think you'll do much better than this or COUNT(DISTINCT value) though.
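For anyone who wants to check this against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module (SQLite is only an assumption, chosen because it needs no server). It runs the MIN/MAX form next to the COUNT(DISTINCT value) form; the helper names are made up for illustration.

```python
import sqlite3

# Rows copied from the example table in the question.
ROWS = [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
        ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)]

MIN_MAX_SQL = """
SELECT pv_name FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING MIN(value) <> MAX(value)
ORDER BY pv_name
"""

COUNT_DISTINCT_SQL = """
SELECT pv_name FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING COUNT(DISTINCT value) > 1
ORDER BY pv_name
"""


def make_db():
    """Create an in-memory database holding the example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)"
    )
    conn.executemany("INSERT INTO example VALUES (?, ?, ?)", ROWS)
    return conn


def names(conn, sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


conn = make_db()
min_max_names = names(conn, MIN_MAX_SQL)
count_distinct_names = names(conn, COUNT_DISTINCT_SQL)
print(min_max_names, count_distinct_names)
```

Both forms skip NULLs, so they should agree on any data; which one is faster depends on the engine and on the available indexes.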

Have you tried joining example to itself? Pseudo-code example:

WITH Q AS
(
    -- value has to be selected here so the self-join can compare it
    SELECT pv_name, value
    FROM example
    WHERE time_stamp > 0 AND time_stamp < 6
)
SELECT DISTINCT Q1.pv_name
FROM Q AS Q1 INNER JOIN Q AS Q2 ON
Q1.pv_name = Q2.pv_name AND
Q1.value <> Q2.value

You probably know about the COUNT(DISTINCT) function and you want to avoid it to prevent unnecessary computations.

It is hard to know why you are looking for this, but I assume that it takes a long time to find these groups using the most obvious query:

SELECT type, COUNT(DISTINCT product)
FROM aTable
GROUP BY type
HAVING COUNT(DISTINCT product) > 1

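I recommend trying window functions, for example T-SQL's LAST_VALUE and FIRST_VALUE (added in SQL Server 2012):

WITH c AS (
SELECT type
 -- Explicit frame: with ORDER BY the default frame ends at the current row,
 -- so LAST_VALUE would otherwise not see the whole partition.
 ,LAST_VALUE(product) OVER (PARTITION BY type ORDER BY product
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lv
 ,FIRST_VALUE(product) OVER (PARTITION BY type ORDER BY product
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) pv
FROM aTable
)
SELECT DISTINCT type FROM c WHERE lv <> pv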

If the DB engine is smart enough it will find the first/last value for the group and will not try to count all the values, and therefore perform better.
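As a rough check of the FIRST_VALUE/LAST_VALUE idea, here is a small sketch run through Python's sqlite3 module. This assumes an SQLite build with window function support (3.25 or later), not SQL Server, and the sample rows are invented for illustration. The frame is written out as ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING so that both functions look at the whole partition.

```python
import sqlite3

# Hypothetical sample rows for aTable(type, product); not from the original post.
ROWS = [
    ("fruit", "apple"),
    ("fruit", "pear"),
    ("fruit", "apple"),
    ("tool", "saw"),
    ("tool", "saw"),
]

# The explicit frame makes FIRST_VALUE/LAST_VALUE return the smallest and
# largest product of each type, regardless of the current row.
WINDOW_SQL = """
WITH c AS (
    SELECT type,
           FIRST_VALUE(product) OVER (
               PARTITION BY type ORDER BY product
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS fv,
           LAST_VALUE(product) OVER (
               PARTITION BY type ORDER BY product
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lv
    FROM aTable
)
SELECT DISTINCT type FROM c WHERE fv <> lv ORDER BY type
"""


def types_with_several_products(rows):
    """Return the types that have more than one distinct product."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE aTable (type TEXT, product TEXT)")
        conn.executemany("INSERT INTO aTable VALUES (?, ?)", rows)
        return [row[0] for row in conn.execute(WINDOW_SQL)]
    finally:
        conn.close()


window_result = types_with_several_products(ROWS)
print(window_result)
```

Only "fruit" has two different products in this sample, so it is the only type that comes back. Whether a given engine actually avoids scanning the whole partition is up to its optimizer.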

For MySQL you can use helper variables to number the distinct values within each group, something like this:

SELECT type, product
FROM (
SELECT  -- restart at 1 for a new type, count up only when the product changes
        @row_num := IF(@prev_type = type,
                       IF(@prev_prod = product, @row_num, @row_num + 1),
                       1) AS RowNumber
       ,type
       ,product
       ,@prev_type := type
       ,@prev_prod := product
  FROM aTable,  -- same table as in the queries above
      (SELECT @row_num := 1) x,
      (SELECT @prev_type := '') y,
      (SELECT @prev_prod := '') z
  ORDER BY type, product
) AS a
WHERE RowNumber > 1

I think HAVING MIN(value) <> MAX(value) will be the most efficient here. An alternative is to anti-join against the names that have only a single value:

 Select distinct e.pv_name
 From example e
 Left join (
     Select pv_name
     From example
     Where ...
     Group by pv_name
     Having min(value) = max(value)
     ) s on e.pv_name = s.pv_name
 Where s.pv_name is null

Or you could use NOT EXISTS against that subquery instead.

Include the relevant where clause in the subquery, and apply the same filter to the outer query.
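A correlated EXISTS is probably the closest thing to the DISTINCT_EXISTS the question describes, since the engine is allowed to stop at the first row whose value differs. Here is a minimal sketch against the question's sample data, run through Python's sqlite3 module. The choice of SQLite and the index name idx_example_pv_value are assumptions for illustration, and whether the engine really stops early depends on its optimizer.

```python
import sqlite3

# Rows copied from the example table in the question.
ROWS = [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
        ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)]

# For each row in the time range, look for any other row of the same pv_name
# (also inside the range) that carries a different value.
EXISTS_SQL = """
SELECT DISTINCT e1.pv_name
FROM example e1
WHERE e1.time_stamp > 0 AND e1.time_stamp < 6
  AND EXISTS (
      SELECT 1
      FROM example e2
      WHERE e2.pv_name = e1.pv_name
        AND e2.time_stamp > 0 AND e2.time_stamp < 6
        AND e2.value <> e1.value
  )
ORDER BY e1.pv_name
"""


def names_with_differing_values(rows):
    """Return the pv_names that have at least two different values in range."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)"
        )
        # Hypothetical index that could help the correlated lookup.
        conn.execute(
            "CREATE INDEX idx_example_pv_value ON example (pv_name, value)"
        )
        conn.executemany("INSERT INTO example VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(EXISTS_SQL)]
    finally:
        conn.close()


exists_result = names_with_differing_values(ROWS)
print(exists_result)
```

B drops out because its second value (3) falls at time_stamp 6, outside the filtered range, which matches the result given in the question.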
