I know I failed to find a proper title.
For the sake of argument, I have this table:
sender|receiver
a | b
c | d
d | e
b | a
f | q
q | f
t | u
I want to count the pairs of rows that are the reverse of each other in the table. For example, the row a|b has a reverse in the table, b|a. Similarly, f|q has the reverse q|f. So, for this table, I want "2" as the answer.
I calculate this as:
CREATE TABLE #temptab
(
sender VARCHAR,
receiver VARCHAR
);
CREATE TABLE #temptab2
(
receiver VARCHAR,
sender VARCHAR
);
INSERT INTO #temptab
(
sender,
receiver
)
SELECT DISTINCT sender,
receiver
FROM table;
INSERT INTO #temptab2
(
receiver,
sender
)
SELECT DISTINCT receiver,
sender
FROM table;
SELECT COUNT(sender)
FROM (SELECT sender,receiver FROM #temptab INTERSECT SELECT receiver,sender FROM #temptab2) AS matched;
Is there a way that I can do this faster?
I would just do:
select count(*)
from #temptab t
where t.sender < t.receiver and
exists (select 1
from #temptab tt
where tt.sender = t.receiver and tt.receiver = t.sender
);
This should work quite well on Postgres. I'm not sure about the performance on Amazon Redshift.
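As a quick sanity check of the exists query on the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for Postgres/Redshift here (an assumption made for convenience, so it says nothing about performance), and the table name pairs and the function name are made up for the example.

```python
import sqlite3

# Sample rows from the question; the table name "pairs" is hypothetical.
ROWS = [("a", "b"), ("c", "d"), ("d", "e"), ("b", "a"),
        ("f", "q"), ("q", "f"), ("t", "u")]


def count_reversed_pairs_exists(rows):
    """Count pairs that also appear reversed, using the EXISTS form."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Redshift
    try:
        conn.execute("CREATE TABLE pairs (sender TEXT, receiver TEXT)")
        conn.executemany("INSERT INTO pairs VALUES (?, ?)", rows)
        (n,) = conn.execute(
            """
            SELECT COUNT(*)
            FROM pairs t
            WHERE t.sender < t.receiver          -- keep one row per pair
              AND EXISTS (SELECT 1
                          FROM pairs tt
                          WHERE tt.sender = t.receiver
                            AND tt.receiver = t.sender)
            """
        ).fetchone()
        return n
    finally:
        conn.close()


print(count_reversed_pairs_exists(ROWS))
```

For the sample rows this should print 2. The sender < receiver condition is what keeps a|b and b|a from being counted twice; it also means a row whose sender equals its receiver is never counted.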
Another method would use two aggregations:
select count(*)
from (select least(sender, receiver) as x1, greatest(sender, receiver) as x2,
count(distinct sender) as cnt
from #temptab
group by x1, x2
) t
where cnt = 2;
However, your version with intersect might be faster.
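The aggregation approach can be tried on the sample data in the same way. This is only a sketch under a few assumptions: it runs in SQLite through Python's sqlite3 module, where the two-argument MIN and MAX scalar functions are swapped in for least and greatest, and the table name pairs and the function name are hypothetical.

```python
import sqlite3

# Sample rows from the question; the table name "pairs" is hypothetical.
ROWS = [("a", "b"), ("c", "d"), ("d", "e"), ("b", "a"),
        ("f", "q"), ("q", "f"), ("t", "u")]


def count_reversed_pairs_grouped(rows):
    """Count unordered pairs that occur in both directions."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Redshift
    try:
        conn.execute("CREATE TABLE pairs (sender TEXT, receiver TEXT)")
        conn.executemany("INSERT INTO pairs VALUES (?, ?)", rows)
        (n,) = conn.execute(
            """
            SELECT COUNT(*)
            FROM (SELECT MIN(sender, receiver) AS x1,   -- least() stand-in
                         MAX(sender, receiver) AS x2,   -- greatest() stand-in
                         COUNT(DISTINCT sender) AS cnt
                  FROM pairs
                  GROUP BY MIN(sender, receiver), MAX(sender, receiver)) t
            WHERE cnt = 2   -- both directions are present
            """
        ).fetchone()
        return n
    finally:
        conn.close()


print(count_reversed_pairs_grouped(ROWS))
```

For the sample rows this should print 2. Because each unordered pair collapses into one group, duplicate rows in the input do not inflate the count.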
The fastest way is usually a join (especially if you have indexes on the two columns):
select count(*)/2
from sr as t1 join sr as t2 on t2.sender=t1.receiver and t2.receiver=t1.sender;
If you have no row with sender=receiver you could also use:
select count(*)
from sr as t1 join sr as t2 on t2.sender=t1.receiver and t2.receiver=t1.sender
where t1.sender < t1.receiver;
In both cases, replace sr with the name of your table.
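To compare the two join variants on the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the real database (an assumption, so it says nothing about speed), the table is called sr as in the queries above, and the function name is made up.

```python
import sqlite3

# Sample rows from the question.
ROWS = [("a", "b"), ("c", "d"), ("d", "e"), ("b", "a"),
        ("f", "q"), ("q", "f"), ("t", "u")]


def count_reversed_pairs_join(rows):
    """Return (halved_count, filtered_count) from the two join queries."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for the real DB
    try:
        conn.execute("CREATE TABLE sr (sender TEXT, receiver TEXT)")
        conn.executemany("INSERT INTO sr VALUES (?, ?)", rows)
        join = (
            "FROM sr AS t1 JOIN sr AS t2 "
            "ON t2.sender = t1.receiver AND t2.receiver = t1.sender"
        )
        # Variant 1: every pair matches in both directions, so halve it.
        (halved,) = conn.execute(f"SELECT COUNT(*) / 2 {join}").fetchone()
        # Variant 2: keep only one direction of each pair.
        (filtered,) = conn.execute(
            f"SELECT COUNT(*) {join} WHERE t1.sender < t1.receiver"
        ).fetchone()
        return halved, filtered
    finally:
        conn.close()


print(count_reversed_pairs_join(ROWS))
```

For the sample rows both variants should give 2. A row with sender = receiver matches itself in the join, so it adds to the halved count but is dropped by the sender < receiver filter; on data with such rows the two variants can disagree.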