简体   繁体   English

为什么此查询不使用我创建的索引?

[英]Why doesn't this query use the indexes i've created?

Running postgres 9.5 运行postgres 9.5

SELECT events.id, events.start_time, events.host_id, events.title from events
JOIN accountsevents ON accountsevents.events_id = events.id
WHERE accountsevents.accounts_id = %(account_id)s OR events.host_id = %(account_id)s
GROUP BY events.id
ORDER BY start_time DESC

I have this query and postgres says the cost is over 100k. 我有这个查询,postgres说费用超过10万。 Seems excessive. 似乎过多。 This is the only query I have that isn't utilizing the indexes i've created for each table. 这是我唯一没有使用为每个表创建的索引的查询。

Indexes:
    "events_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "events_host_id_fkey" FOREIGN KEY (host_id) REFERENCES accounts(id)
Referenced by:
    TABLE "accountsevents" CONSTRAINT "accountsevents_events_id_fkey" FOREIGN KEY (events_id) REFERENCES events(id)
    TABLE "eventsinterests" CONSTRAINT "eventsinterests_events_id_fkey" FOREIGN KEY (events_id) REFERENCES events(id)


Indexes:
    "accountsevents_pkey" PRIMARY KEY, btree (id, accounts_id, events_id)
Foreign-key constraints:
    "accountsevents_accounts_id_fkey" FOREIGN KEY (accounts_id) REFERENCES accounts(id)
    "accountsevents_events_id_fkey" FOREIGN KEY (events_id) REFERENCES events(id)

I feel like the index is setup wrong or I'm just missing something with the query. 我觉得索引设置错误,或者查询中缺少内容。 The initial sequence scans are killing it. 最初的序列扫描正在杀死它。

ADDED EXPLAIN VERBOSE 添加详细说明

"Sort  (cost=124388.27..124390.10 rows=732 width=40) (actual time=1533.902..1533.928 rows=470 loops=1)"
"  Output: events.id, events.start_time, events.host_id, events.title"
"  Sort Key: events.start_time DESC"
"  Sort Method: quicksort  Memory: 66kB"
"  ->  HashAggregate  (cost=124346.12..124353.44 rows=732 width=40) (actual time=1533.658..1533.759 rows=470 loops=1)"
"        Output: events.id, events.start_time, events.host_id, events.title"
"        Group Key: events.id"
"        ->  Hash Join  (cost=4912.30..124344.29 rows=732 width=40) (actual time=56.671..1532.831 rows=971 loops=1)"
"              Output: events.id, events.start_time, events.host_id, events.title"
"              Hash Cond: (accountsevents.events_id = events.id)"
"              Join Filter: ((accountsevents.accounts_id = 1) OR (events.host_id = 1))"
"              Rows Removed by Join Filter: 2761882"
"              ->  Seq Scan on public.accountsevents  (cost=0.00..45228.52 rows=2762852 width=8) (actual time=0.005..466.094 rows=2762853 loops=1)"
"                    Output: accountsevents.events_id, accountsevents.accounts_id"
"              ->  Hash  (cost=2795.91..2795.91 rows=104191 width=40) (actual time=53.579..53.579 rows=104181 loops=1)"
"                    Output: events.id, events.start_time, events.host_id, events.title"
"                    Buckets: 65536  Batches: 4  Memory Usage: 2462kB"
"                    ->  Seq Scan on public.events  (cost=0.00..2795.91 rows=104191 width=40) (actual time=0.004..26.171 rows=104181 loops=1)"
"                          Output: events.id, events.start_time, events.host_id, events.title"
"Planning time: 0.201 ms"
"Execution time: 1534.024 ms"

No index will help you with this query. 没有索引可以帮助您进行此查询。

The problem is that you have an OR in the WHERE condition, so it is not possible to apply a filter before the tables are joined, which is where an index could help you. 问题是您在WHERE条件下具有OR ,因此无法连接表之前应用过滤器,这是索引可以为您提供帮助的地方。 Try to replace the OR with an AND and see how PostgreSQL can do much better. 尝试用AND替换OR AND然后看看PostgreSQL如何做得更好。

This way PostgreSQL has to compute the whole join and can only filter out rows afterwards – see the great number of Rows Removed by Join Filter . 这样,PostgreSQL必须计算整个联接,并且之后只能过滤掉行–请参阅Join Filter删除的大量

The only thing an index could be used for would be a nested loop join, and that would be much more expensive. 索引唯一可用于的是嵌套循环连接,这将花费更多。 So I don't think that there is a better plan for this query. 因此,我认为此查询没有更好的计划。

You can see that PostgreSQL's estimates of the number of rows is pretty good, which is normally a sign that PostgreSQL indeed did the right thing. 您可以看到PostgreSQL对行数的估计是相当不错的,通常这表明PostgreSQL确实做了正确的事。

Maybe you can do better with a query like 也许您可以通过以下查询来做得更好

SELECT * FROM
   (SELECT ... FROM events JOIN accountsevents ON ...
       WHERE accountsevents.accounts_id = 1
    UNION
    SELECT ... FROM events JOIN accountsevents ON ...
       WHERE events.host_id = 1) sub
GROUP BY ... ORDER BY ...

but I wouldn't bet on it. 但我不会打赌。
(Note: this query is semantically slightly different, but maybe it doesn't matter in your case.) (请注意:此查询在语义上略有不同,但对于您的情况可能并不重要。)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM