
SQL - extracting all pairs only once

Problem statement:

  • a table with N columns, K of which are used in a criterion that determines pairs of rows
  • the criterion can be as simple as requiring columns c_1, c_2, ..., c_K to be equal in the two distinct rows of a pair (the criterion itself is not relevant, only the fact that one must be used)
  • the requirement is to extract all potential pairs, but only once. This means that if a row has more than one other row it could pair with under the above criterion, only one pair must be extracted

Simple example:

Input table:

A | B | C
x | y | z
w | y | z
u | y | z
u | v | z
v | v | z

Criterion: columns B and C must be the same for two rows to form a pair.

Output:

x | y | z
w | y | z
u | v | z
v | v | z

What hints do you have for solving the problem in pure SQL (or in the Oracle dialect, if specific features help)?

If you can use window (analytic) functions:

CREATE TABLE TT1 (A VARCHAR(4), B VARCHAR(4), C VARCHAR(4));
INSERT INTO TT1 VALUES ('x','y','z');
INSERT INTO TT1 VALUES ('w','y','z');
INSERT INTO TT1 VALUES ('u','y','z');
INSERT INTO TT1 VALUES ('u','v','z');
INSERT INTO TT1 VALUES ('v','v','z');
INSERT INTO TT1 VALUES ('k','w','z');


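-- Keep the first two rows of every (B, C) group that has at least two rows.
SELECT r.A, r.B, r.C
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY B, C ORDER BY A DESC) AS RN,  -- position within the group
           COUNT(*)     OVER (PARTITION BY B, C)                 AS RC   -- size of the group
    FROM TT1 t
) r
WHERE r.RN <= 2 AND r.RC > 1;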

Output:

A    B    C
---- ---- ----
v    v    z
u    v    z
x    y    z
w    y    z
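
To check the result without an Oracle instance, the same idea can be run against an in-memory SQLite database from Python. This is only a sketch under assumptions: SQLite 3.25 or later (the first release with window functions) stands in for Oracle, the subquery alias `r` and the final ORDER BY are additions made here so that the row order is deterministic, and the table and data are the ones from the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle
# (assumption: SQLite >= 3.25, which supports window functions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TT1 (A VARCHAR(4), B VARCHAR(4), C VARCHAR(4))")
conn.executemany(
    "INSERT INTO TT1 VALUES (?, ?, ?)",
    [("x", "y", "z"), ("w", "y", "z"), ("u", "y", "z"),
     ("u", "v", "z"), ("v", "v", "z"), ("k", "w", "z")],
)

# RN numbers the rows inside each (B, C) group, RC is the group size.
# Groups of size 1 cannot form a pair, so RC > 1 filters them out.
QUERY = """
SELECT r.A, r.B, r.C
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY B, C ORDER BY A DESC) AS RN,
           COUNT(*)     OVER (PARTITION BY B, C)                 AS RC
    FROM TT1 t
) r
WHERE r.RN <= 2 AND r.RC > 1
ORDER BY r.B, r.C, r.A DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The row ('k','w','z') has no partner and is dropped by the RC > 1 filter, and the third row of the (y, z) group is dropped by RN <= 2.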

Use the COUNT() analytic function, partitioned by the columns you want to match on, to pair up consecutive rows:

SELECT A, B, C
FROM   (
  SELECT t.*,
         COUNT(*) OVER (
             PARTITION BY B, C
             ORDER BY A
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
         ) AS current_rn,
         COUNT(*) OVER (
             PARTITION BY B, C
             ORDER BY A
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING
         ) AS next_rn
  FROM   table_name t
)
WHERE  MOD( current_rn, 2 ) = 0
OR     MOD( next_rn, 2 ) = 0;

Output:

A B C
- - -
u y z
w y z
u v z
v v z
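
The two answers agree on the sample data but behave differently once a group has four or more matching rows: the ROW_NUMBER() query keeps a single pair per group, while the COUNT() query pairs up consecutive rows and so keeps as many pairs as fit. Which one is wanted depends on how "only once" in the question is read. The sketch below illustrates the difference on a hypothetical group of five rows. It assumes SQLite 3.25 or later as a stand-in for Oracle, uses made-up values 'p' to 't' for column A, and writes `%` where the answer uses MOD() because not every SQLite build ships a mod() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (A TEXT, B TEXT, C TEXT)")
# Hypothetical data: five rows that all match on (B, C).
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(a, "y", "z") for a in ("p", "q", "r", "s", "t")],
)

# First answer: the first two rows (by A descending) of each group with 2+ rows.
TOP_TWO = """
SELECT A FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY B, C ORDER BY A DESC) AS rn,
           COUNT(*)     OVER (PARTITION BY B, C)                 AS rc
    FROM table_name t
) WHERE rn <= 2 AND rc > 1
ORDER BY A
"""

# Second answer: a row is kept if it closes a pair (even running count)
# or if the next row closes one (even count including the following row).
CONSECUTIVE = """
SELECT A FROM (
    SELECT t.*,
           COUNT(*) OVER (PARTITION BY B, C ORDER BY A
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS current_rn,
           COUNT(*) OVER (PARTITION BY B, C ORDER BY A
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) AS next_rn
    FROM table_name t
) WHERE current_rn % 2 = 0 OR next_rn % 2 = 0
ORDER BY A
"""

top_two = [r[0] for r in conn.execute(TOP_TWO)]
consecutive = [r[0] for r in conn.execute(CONSECUTIVE)]
print("one pair per group:", top_two)
print("consecutive pairs: ", consecutive)
```

With five matching rows the first query returns two rows (one pair) and the second returns four rows (two pairs), leaving the odd row out.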

