I have a problem with an SQL query. I need all pairs of rows from two independent tables that do not have a row joining them in a third table. The query below works, but it performs very badly.
My query currently looks like this:
SELECT s.id,
u.id
FROM table1 s,
table2 u
WHERE NOT EXISTS
(
SELECT *
FROM table3 sj
WHERE sj.s_id=s.id
AND sj.u_id=u.id
)
Keys on table3 are:
ALTER TABLE `table3`
ADD PRIMARY KEY (`id`),
ADD KEY `s_id` (`s_id`),
ADD KEY `u_id` (`u_id`);
table1 has 4 rows, table2 has 80,000 rows, and table3 has 30,000 rows.
Any ideas how to optimise it? The query currently takes up to 20 minutes to return results.
Edit: Regarding the 20 minutes: I forgot to set a key on table3(u_id).
After setting the key, the query took just a few seconds. Great.
Your query looks to me like the right way to do what you want. I would just rewrite the old-school implicit join as an explicit cross join (but that's semantically equivalent).
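Written with an explicit cross join, the query from the question would look roughly like this. It uses the same tables and columns; replacing `SELECT *` with `SELECT 1` in the subquery is only a stylistic choice, not something the rewrite depends on:

```sql
-- Same result as the implicit join (FROM table1 s, table2 u),
-- but the cross join is spelled out.
SELECT s.id,
       u.id
FROM table1 s
CROSS JOIN table2 u
WHERE NOT EXISTS (
    SELECT 1
    FROM table3 sj
    WHERE sj.s_id = s.id
      AND sj.u_id = u.id
);
```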
For performance, you need an index on table3(s_id, u_id).
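A minimal sketch of adding that composite index, in the same style as the key definitions in the question. The index name `s_u` is just a placeholder chosen for illustration:

```sql
-- Composite index covering both columns compared in the NOT EXISTS subquery.
-- The name `s_u` is hypothetical; use whatever fits your naming scheme.
ALTER TABLE `table3`
  ADD KEY `s_u` (`s_id`, `u_id`);

-- Inspect the plan: the row for table3 (alias sj) should list the new
-- index in its `key` column.
EXPLAIN
SELECT s.id,
       u.id
FROM table1 s
CROSS JOIN table2 u
WHERE NOT EXISTS (
    SELECT 1
    FROM table3 sj
    WHERE sj.s_id = s.id
      AND sj.u_id = u.id
);
```

Because lookups on `s_id` alone can use the leftmost column of the composite index, the separate single-column key on `s_id` is probably redundant once this index exists.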
However, keep in mind that cross joining the tables generates a derived table of about 320,000 rows (4 × 80,000, going by the row counts in the question), and the not exists condition has to be checked against table3 for each of them, so there is still a lot of work for the database to do.
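To get a feel for how much work that is, the number of combinations the cross join produces can be computed directly from the row counts. A small sketch, assuming the table names from the question (the column alias `candidate_pairs` is made up for illustration):

```sql
-- Number of (s, u) combinations produced by the cross join.
-- The NOT EXISTS condition has to be decided for each of them.
SELECT (SELECT COUNT(*) FROM table1)
     * (SELECT COUNT(*) FROM table2) AS candidate_pairs;
```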
If the id values are not unique in the source tables (table1 and table2), then you can deduplicate before cross joining:
select ...
from (select distinct id from table1) s
cross join (select distinct id from table2) u
where not exists (...)