
Query to select rows with minimum distinct value of a column

I need to select the row with the minimum value of column B for each value of column A, but that B must be distinct from the B values already selected for earlier values of A. So the order of A matters. Also, if the B values are used up and none is left, the later values of A should either be NULL or not appear in the result.

Both A and B are numerical (or timestamps). Example:

A   | B | 
----+---+
1   | 3 | 
1   | 5 | 
1   | 6 | 
2   | 3 | 
2   | 5 | 
9   | 3 |
9   | 5 | 

So the desired result is:

A   | B | 
----+---+
1   | 3 | 
2   | 5 | 

select A, min(B) group by A obviously doesn't work because I don't want B to be repeated. Distinct also doesn't work because the rows are already distinct. I couldn't find any similar question anywhere. The actual data I am working with is a time-series database on Redshift, so A and B are timestamps. CTEs would be especially welcome.
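The requirement described above can be read as a greedy procedure: visit the values of A in ascending order and give each one the smallest B that no earlier A has taken. The following is a minimal Python sketch of that reading, not a Redshift solution; the function name `pick_min_distinct` is hypothetical, and it assumes that an A with no B left is omitted rather than returned as NULL.

```python
def pick_min_distinct(rows):
    """Greedy selection: visit A in ascending order, take the smallest unused B."""
    used = set()      # B values already assigned to an earlier A
    result = []
    for a in sorted({a for a, _ in rows}):
        candidates = [b for x, b in rows if x == a and b not in used]
        if candidates:            # an A with no B left is simply omitted
            b = min(candidates)
            used.add(b)
            result.append((a, b))
    return result


# Sample data from the question.
rows = [(1, 3), (1, 5), (1, 6), (2, 3), (2, 5), (9, 3), (9, 5)]
print(pick_min_distinct(rows))  # [(1, 3), (2, 5)]
```

A = 9 does not appear because both of its B values (3 and 5) were already taken by A = 1 and A = 2.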

At first I thought this could be solved with ROW_NUMBER() OVER (PARTITION BY A ORDER BY B), but there is a problem: the values in B must not be repeated.

At the moment the only thing that comes to mind is to use temporary tables. I know this is not the best way, but you can probably improve on it:

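-- Work tables: A values still to process, B values already used, and the result.
DECLARE @Tabla1 TABLE(A INT);
DECLARE @Tabla2 TABLE(B INT);
DECLARE @Tabla3 TABLE(A INT, B INT);
DECLARE @A INT, @B INT;
INSERT INTO @Tabla1 SELECT DISTINCT A FROM PRUEBA;

WHILE EXISTS (SELECT 1 FROM @Tabla1)
BEGIN
  -- Take the smallest remaining A so that A is processed in order.
  SET @A = (SELECT MIN(A) FROM @Tabla1);
  -- Smallest B for this A that no earlier A has claimed (NULL if none is left).
  SET @B = (SELECT MIN(B) FROM PRUEBA WHERE A = @A AND B NOT IN (SELECT B FROM @Tabla2));
  -- Skip NULL: a NULL in @Tabla2 would make every later NOT IN test evaluate to UNKNOWN.
  IF @B IS NOT NULL
  BEGIN
    INSERT INTO @Tabla2 VALUES (@B);
    INSERT INTO @Tabla3 VALUES (@A, @B);
  END
  DELETE FROM @Tabla1 WHERE A = @A;
END

SELECT * FROM @Tabla3;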

You could also use a cursor, but you would have to measure which is more expensive computationally, the cursor or the temporary tables.

This is basically a "find the diagonal" problem. You need to know the rank of B within A and the rank of A overall. I believe this works for the data given:

select A, B from (
  select row_number() over (partition by A order by B) as RN,
    dense_rank() over (order by A) as DR,
    A, B
    from <table> )
where RN = DR; 
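As a quick check of the query above against the sample data, here is a small sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for Redshift purely so the example is runnable (an assumption, and it needs SQLite 3.25 or later for window functions); the table name `t` and the subquery alias `ranked` are hypothetical.

```python
import sqlite3

# Sample data from the question.
rows = [(1, 3), (1, 5), (1, 6), (2, 3), (2, 5), (9, 3), (9, 5)]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (A integer, B integer)")
conn.executemany("insert into t values (?, ?)", rows)

# RN ranks B within each A; DR ranks A overall; keep rows on the "diagonal".
query = """
select A, B from (
  select row_number() over (partition by A order by B) as RN,
         dense_rank() over (order by A) as DR,
         A, B
  from t
) as ranked
where RN = DR
order by A
"""
diagonal = conn.execute(query).fetchall()
conn.close()
print(diagonal)  # [(1, 3), (2, 5)]
```

A = 9 has dense rank 3 but only two B values, so no row satisfies RN = DR and it drops out of the result.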

For more complex cases this solution will get more complex.
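One way to see what a "more complex case" might look like is data where the values of A do not share the same B values. The sketch below is an illustration under assumptions, not part of the original answer: it runs the diagonal query in SQLite (standing in for Redshift, with hypothetical names `t`, `ranked` and `run_diagonal`) on the rows (1, 3), (2, 5), (2, 6). Under the requirement in the question, A = 2 should get B = 5 because 5 is still unused, but the diagonal rule picks the second-smallest B for the second A.

```python
import sqlite3

DIAGONAL_QUERY = """
select A, B from (
  select row_number() over (partition by A order by B) as RN,
         dense_rank() over (order by A) as DR,
         A, B
  from t
) as ranked
where RN = DR
order by A
"""


def run_diagonal(rows):
    """Load rows into an in-memory table t and run the diagonal query on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (A integer, B integer)")
    conn.executemany("insert into t values (?, ?)", rows)
    result = conn.execute(DIAGONAL_QUERY).fetchall()
    conn.close()
    return result


# A = 1 only has B = 3, so B = 5 is still free when A = 2 is reached.
print(run_diagonal([(1, 3), (2, 5), (2, 6)]))  # [(1, 3), (2, 6)]
```

The query returns (2, 6) where the stated requirement would give (2, 5), so the simple version relies on the B values lining up across the values of A, as they do in the sample data.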

Addendum: Because I know it will be asked and this is an interesting problem, I worked out what such a more complex solution would look like:

select min(A) as A, B from (
  select decode(A <> nvl(min(A) over (order by DRB, DRA rows between unbounded preceding and 1 preceding),-1), true, 'good', 'no good') as Y,
    A, B from (
    select dense_rank() over (partition by B order by A) as DRA,
      dense_rank() over ( order by B) as DRB,
      A, B from <table>
  )
  where DRA <= DRB
)
where Y = 'good'
group by B
order by A, B;

