简体   繁体   中英

Compare each row in a table to every other row, once and once only

I have a postgres table full of records which I would like to compare to every other record in the same table. Consider this table.

create table temp (rec char(1));

Populated with this data:

insert into temp (rec) values ('a');
insert into temp (rec) values ('b');
insert into temp (rec) values ('c');
insert into temp (rec) values ('d');

And queried with this SQL:

select a.rec, b.rec
from temp a, temp b 

This returns 16 rows (as expected) but it includes extra rows where 'a' = 'a' and 'b' = 'b'. To remove these I can change the query to.

select a.rec, b.rec
from temp a, temp b 
where a.rec <> b.rec

This cuts the results down to 12 rows:

"a" - "b"
"a" - "c"
"a" - "d"
"b" - "a"
"b" - "c"
"b" - "d"
"c" - "a"
"c" - "b"
"c" - "d"
"d" - "a"
"d" - "b"
"d" - "c"

However, it still pulls back some duplicates, not only does it pull back a record where 'a' = 'b' but it also pulls back 'b' = 'a' which I have no interest in (I've already compared record 'a' with record 'b' so have no interest in doing the same comparison the other way around)

So, how can I alter the query to compare every row with every other row once and once only?

Use distinct with your last query:

select distinct least(a.rec, b.rec), greatest(a.rec, b.rec)
from temp a
  cross join temp b 
where a.rec <> b.rec;

The expression least(a.rec, b.rec), greatest(a.rec, b.rec) turns b,a into a,b and the distinct then removes the duplicates.

Online example: https://rextester.com/TDZQ5153

Simply change your condition to < :

select a.rec, b.rec
from temp a join
     temp b 
     on a.rec < b.rec;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM