SQL database index design for inner join keyword search

I have this query:

SELECT a.* 
FROM entries a 
INNER JOIN entries_keywords b ON a.id = b.entry_id 
INNER JOIN keywords c ON b.keyword_id = c.id 
WHERE c.key IN ('wake', 'up') 
GROUP BY a.id 
HAVING COUNT(*) = 2

but it's slow. How do I design indexes optimally to speed things up?

EDIT: This is the current schema:

CREATE TABLE `entries` (`id` integer PRIMARY KEY AUTOINCREMENT, `sha` text);
CREATE TABLE `entries_keywords` (`id` integer PRIMARY KEY AUTOINCREMENT, `entry_id` integer REFERENCES `entries`, `keyword_id` integer REFERENCES `keywords`);
CREATE TABLE `keywords` (`id` integer PRIMARY KEY AUTOINCREMENT, `key` string);
CREATE INDEX `entries_keywords_entry_id_index` ON `entries_keywords` (`entry_id`);
CREATE INDEX `entries_keywords_entry_id_keyword_id_index` ON `entries_keywords` (`entry_id`, `keyword_id`);
CREATE INDEX `entries_keywords_keyword_id_index` ON `entries_keywords` (`keyword_id`);
CREATE INDEX `keywords_key_index` ON `keywords` (`key`);

I'm using SQLite3. The query doesn't fail, but it is slow.

Right now I'm using a query like this (one subquery per keyword):

select *
from (
    select *
    from (entries) e
    inner join entries_keywords ek on e.id = ek.entry_id
    inner join keywords k on ek.keyword_id = k.id
    where k.key = 'wake') e
inner join entries_keywords ek on e.id = ek.entry_id
inner join keywords k on ek.keyword_id = k.id
where k.key = 'up';

This is much faster, but it doesn't feel right, since it will get ugly if I have a lot of keywords.

The key indexes required for that query are:

  • keywords(key)
  • entries_keywords(keyword_id,entry_id)
  • entries(id)
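
As a rough illustration of the list above, here is a minimal sketch using Python's standard sqlite3 module. It assumes the schema from the question; the index names, the sample rows, and the helper names (build_db, load_sample, find_entries) are made up for the example. It also assumes each (entry_id, keyword_id) pair appears only once, otherwise COUNT(*) would over-count.

```python
import sqlite3


def build_db():
    """Create an in-memory database with the question's tables and the suggested indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entries (id integer PRIMARY KEY AUTOINCREMENT, sha text);
        CREATE TABLE keywords (id integer PRIMARY KEY AUTOINCREMENT, key text);
        CREATE TABLE entries_keywords (
            id integer PRIMARY KEY AUTOINCREMENT,
            entry_id integer REFERENCES entries,
            keyword_id integer REFERENCES keywords);
        -- Index names are hypothetical; only the column order matters.
        CREATE INDEX keywords_key_index ON keywords (key);
        CREATE INDEX entries_keywords_keyword_id_entry_id_index
            ON entries_keywords (keyword_id, entry_id);
        -- entries(id) is the INTEGER PRIMARY KEY, so no extra index is created for it.
    """)
    return conn


def load_sample(conn):
    """Insert a few made-up rows so the query has something to match."""
    conn.executemany("INSERT INTO entries (id, sha) VALUES (?, ?)",
                     [(1, "a"), (2, "b"), (3, "c")])
    conn.executemany("INSERT INTO keywords (id, key) VALUES (?, ?)",
                     [(1, "wake"), (2, "up"), (3, "sleep")])
    conn.executemany(
        "INSERT INTO entries_keywords (entry_id, keyword_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (2, 1), (3, 2), (3, 3), (3, 1)])


def find_entries(conn, words):
    """Return ids of entries tagged with every keyword in `words`."""
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT a.id
        FROM entries a
        INNER JOIN entries_keywords b ON a.id = b.entry_id
        INNER JOIN keywords c ON b.keyword_id = c.id
        WHERE c.key IN ({placeholders})
        GROUP BY a.id
        HAVING COUNT(*) = ?
        ORDER BY a.id
    """
    return [row[0] for row in conn.execute(sql, (*words, len(words)))]


conn = build_db()
load_sample(conn)
print(find_entries(conn, ["wake", "up"]))
```

Because the keyword list is passed as bound parameters, the same query text works for any number of keywords without adding one subquery per keyword.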

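You must be using a lenient engine such as MySQL or SQLite, because the SELECT a.* would otherwise fail.
EDIT: after the 2nd comment about this statement, let me point out why SELECT a.* would fail elsewhere: it's because of the GROUP BY. Stricter engines (SQL Server, for example) reject selected columns that are neither listed in the GROUP BY nor aggregated.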

To explain: because the filter (WHERE) is on c.key, that column needs to be indexed.
The matching keyword rows are then joined against b.keyword_id. We create the index on that column so that it also includes b.entry_id, which means the engine never has to look up the table itself: the index alone covers the columns required.
Finally, a.id = b.entry_id joins back to the entries table, so we index the id of that table.
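
One way to check this reasoning is to ask SQLite for its query plan. The sketch below is only an illustration: it builds empty copies of the two tables with made-up index names (ek_keyword_entry is hypothetical) and prints the plan for the keyword-to-entry part of the join. The exact wording of the plan varies between SQLite versions, so read it for the index names rather than the precise text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (id integer PRIMARY KEY AUTOINCREMENT, key text);
    CREATE TABLE entries_keywords (
        id integer PRIMARY KEY AUTOINCREMENT,
        entry_id integer,
        keyword_id integer);
    CREATE INDEX keywords_key_index ON keywords (key);
    -- Hypothetical name; keyword_id first, then entry_id.
    CREATE INDEX ek_keyword_entry ON entries_keywords (keyword_id, entry_id);
""")

# EXPLAIN QUERY PLAN returns one row per step; the last column is the description.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT b.entry_id
    FROM keywords c
    INNER JOIN entries_keywords b ON b.keyword_id = c.id
    WHERE c.key IN ('wake', 'up')
""").fetchall()

plan_details = [row[3] for row in plan]
for detail in plan_details:
    print(detail)
```

If both index names show up in the output, the keyword filter and the join step are each being resolved through an index rather than a full table scan.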

It is quite likely that entries(id) is already the primary key, but you may have entries_keywords indexed the other way around, as (entry_id, keyword_id). An index in that order won't work to satisfy this join, because the lookup starts from keyword_id.
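
The effect of column order can be seen with a small experiment. This is a sketch under assumptions: plan_for and the index name ek_idx are hypothetical, the table is empty, and no ANALYZE statistics exist, so it only shows which access strategy SQLite picks, not actual timings.

```python
import sqlite3


def plan_for(index_columns):
    """Return SQLite's plan for a keyword_id lookup, given one composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries_keywords ("
        "id integer PRIMARY KEY AUTOINCREMENT, "
        "entry_id integer, keyword_id integer)")
    # index_columns is a trusted literal here, not user input.
    conn.execute(f"CREATE INDEX ek_idx ON entries_keywords ({index_columns})")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT entry_id FROM entries_keywords WHERE keyword_id = 1").fetchall()
    conn.close()
    return " | ".join(row[3] for row in rows)


keyword_first = plan_for("keyword_id, entry_id")
entry_first = plan_for("entry_id, keyword_id")
print("keyword_id first:", keyword_first)
print("entry_id first:  ", entry_first)
```

With keyword_id as the leading column the plan should be a SEARCH (an index seek), while the entry_id-first index can only be walked in full, which SQLite reports as a SCAN.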
