
SQL - extracting all pairs only once

Problem statement:

  • a table has N columns, K of which are used in a criterion that determines whether two rows form a pair
  • the criterion can be as simple as requiring columns c_1, c_2, ..., c_k to be equal in the two rows of a pair (the exact criterion is not important, only the fact that one must be applied)
  • the requirement is to extract all potential pairs, but to use each row only once. If a row could be paired with more than one other row under the criterion, only one of those pairs must be extracted

Simple example:

Input table:

A | B | C
x | y | z
w | y | z
u | y | z
u | v | z
v | v | z

Criterion: B and C columns must be the same for two rows to be part of a pair.

Output:

x | y | z
w | y | z
u | v | z
v | v | z
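
To make the requirement concrete, here is a minimal reference sketch in Python rather than SQL. It assumes that rows are paired off in input order within each group and that a leftover row in an odd-sized group is simply dropped; the problem statement does not say which pair should win, so that choice is only an assumption. The function name extract_pairs and its parameters are hypothetical.

```python
from collections import defaultdict

def extract_pairs(rows, key_columns):
    """Group rows by the key columns and pair them off two at a time.

    A row left over in an odd-sized group is dropped, so no row is used twice.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in key_columns)].append(row)

    pairs = []
    for members in groups.values():
        # Step through the group two rows at a time; a trailing single row is ignored.
        for i in range(0, len(members) - 1, 2):
            pairs.append((members[i], members[i + 1]))
    return pairs

# The input table from the example; columns B and C (indexes 1 and 2) form the criterion.
rows = [
    ("x", "y", "z"),
    ("w", "y", "z"),
    ("u", "y", "z"),
    ("u", "v", "z"),
    ("v", "v", "z"),
]
pairs = extract_pairs(rows, key_columns=(1, 2))
for first, second in pairs:
    print(first, second)
```

With the sample input this yields the pairs (x, w) and (u, v), and the third row of the (y, z) group stays unpaired.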

What hints do you have for solving the problem in pure SQL (or in the Oracle dialect, if specific features help)?

If you can use window (analytic) functions:

CREATE TABLE TT1 (A VARCHAR(4), B VARCHAR(4), C VARCHAR(4));
INSERT INTO TT1 VALUES ('x','y','z');
INSERT INTO TT1 VALUES ('w','y','z');
INSERT INTO TT1 VALUES ('u','y','z');
INSERT INTO TT1 VALUES ('u','v','z');
INSERT INTO TT1 VALUES ('v','v','z');
-- A row with no partner, to show that unpaired rows are filtered out
INSERT INTO TT1 VALUES ('k','w','z');


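SELECT A.A, A.B, A.C
FROM
(SELECT t.*,  -- Oracle requires a table alias before * when other expressions follow
        ROW_NUMBER() OVER (PARTITION BY B, C ORDER BY A DESC) RN,  -- position inside the (B, C) group
        COUNT(*) OVER (PARTITION BY B, C) RC                       -- size of the (B, C) group
 FROM TT1 t) A
WHERE A.RN <= 2   -- keep the first two rows of each group
  AND A.RC > 1;   -- skip groups that have no partner row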

Output:

A    B    C
---- ---- ----
v    v    z
u    v    z
x    y    z
w    y    z
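
The query above can be tried without an Oracle instance. The following is a minimal sketch that runs it through Python's built-in sqlite3 module, assuming a bundled SQLite version with window-function support (3.25 or later). The subquery alias ranked and the ORDER BY clause are additions made here so that the row order is deterministic; they are not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TT1 (A VARCHAR(4), B VARCHAR(4), C VARCHAR(4))")
conn.executemany(
    "INSERT INTO TT1 VALUES (?, ?, ?)",
    [("x", "y", "z"), ("w", "y", "z"), ("u", "y", "z"),
     ("u", "v", "z"), ("v", "v", "z"), ("k", "w", "z")],
)

query = """
SELECT A, B, C
FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY B, C ORDER BY A DESC) AS RN,
         COUNT(*)     OVER (PARTITION BY B, C)                 AS RC
  FROM TT1 t
) ranked
WHERE RN <= 2 AND RC > 1
ORDER BY B, C, A DESC   -- added only to make the output order deterministic
"""
first_result = conn.execute(query).fetchall()
for row in first_result:
    print(row)

# Add a fourth row to the (y, z) group and run the same query again.
conn.execute("INSERT INTO TT1 VALUES ('t', 'y', 'z')")
after_extra = conn.execute(query).fetchall()
print(len(after_extra))
```

Because the filter is RN <= 2, this approach returns at most one pair per (B, C) group: the second run still returns four rows even though the (y, z) group now holds four candidates.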

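Use the COUNT() analytic function, partitioning on the columns that determine a pair. A row is kept if its running count within the partition is even (it closes a pair) or if the count including the next row is even (it opens a pair):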

SELECT A, B, C
FROM   (
  SELECT t.*,
         COUNT(*) OVER (
             PARTITION BY B, C
             ORDER BY A
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
         ) AS current_rn,
         COUNT(*) OVER (
             PARTITION BY B, C
             ORDER BY A
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING
         ) AS next_rn
  FROM   table_name t
)
WHERE  MOD( current_rn, 2 ) = 0
OR     MOD( next_rn, 2 ) = 0;

Output:

A B C
- - -
u y z
w y z
u v z
v v z
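
A minimal sketch for checking the running-count query with Python's built-in sqlite3 module, again assuming SQLite 3.25 or later. Two things differ from the Oracle version and are assumptions of this sketch: MOD(n, 2) is written as n % 2, because SQLite builds do not always include a MOD function, and the subquery gets the alias counted. The ORDER BY clause is added only for a deterministic row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (A TEXT, B TEXT, C TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [("x", "y", "z"), ("w", "y", "z"), ("u", "y", "z"),
     ("u", "v", "z"), ("v", "v", "z")],
)

query = """
SELECT A, B, C
FROM (
  SELECT t.*,
         COUNT(*) OVER (PARTITION BY B, C ORDER BY A
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS current_rn,
         COUNT(*) OVER (PARTITION BY B, C ORDER BY A
                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING) AS next_rn
  FROM table_name t
) counted
WHERE current_rn % 2 = 0   -- row closes a pair
   OR next_rn % 2 = 0      -- row opens a pair that the next row closes
ORDER BY B, C, A
"""
paired = conn.execute(query).fetchall()
for row in paired:
    print(row)

# With a fourth row in the (y, z) group, both pairs of that group are returned.
conn.execute("INSERT INTO table_name VALUES ('t', 'y', 'z')")
paired_after_extra = conn.execute(query).fetchall()
print(len(paired_after_extra))
```

Unlike a filter on the first two rows of each group, this version returns every complete pair in a partition and drops only the last row of an odd-sized group.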

