
PostgreSQL query not using index in production

I'm noticing something strange/weird:

The exact same query in development and production is not using the same query plan. In particular, the development version uses indexes that are omitted in production (in favor of a seqscan).

The only real difference is that the dataset in production is significantly larger: the index size is 1034 MB in production, vs 29 MB in development. Would PostgreSQL abstain from using indexes if they (or the table) are too big?
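(For reference, a minimal way to check those sizes in PostgreSQL, using table/index names that appear in the plans below, is:

-- Sketch only; substitute your actual table and index names.
SELECT pg_size_pretty(pg_relation_size('scenario_ad_group_performances'));
SELECT pg_size_pretty(pg_relation_size('index_ad_group_performances_on_vendor_instance_id_and_date'));
)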

EDIT: EXPLAIN ANALYZE for both queries:

Development:

Limit  (cost=41638.15..41638.20 rows=20 width=154) (actual time=159.576..159.581 rows=20 loops=1)
  ->  Sort  (cost=41638.15..41675.10 rows=14779 width=154) (actual time=159.575..159.577 rows=20 loops=1)
        Sort Key: (sum(scenario_ad_group_performances.clicks))
        Sort Method: top-N heapsort  Memory: 35kB
        ->  GroupAggregate  (cost=0.00..41244.89 rows=14779 width=154) (actual time=0.040..151.535 rows=14197 loops=1)
              ->  Nested Loop Left Join  (cost=0.00..31843.75 rows=93800 width=154) (actual time=0.022..82.509 rows=50059 loops=1)
                    ->  Merge Left Join  (cost=0.00..4203.46 rows=14779 width=118) (actual time=0.017..27.103 rows=14197 loops=1)
                          Merge Cond: (scenario_ad_groups.id = scenario_ad_group_vendor_instances.ad_group_id)
                          ->  Index Scan using scenario_ad_groups_pkey on scenario_ad_groups  (cost=0.00..2227.06 rows=14779 width=114) (actual time=0.009..12.085 rows=14197 loops=1)
                                Filter: (scenario_id = 22)
                          ->  Index Scan using index_scenario_ad_group_vendor_instances_on_ad_group_id on scenario_ad_group_vendor_instances  (cost=0.00..1737.02 rows=27447 width=8) (actual time=0.007..7.021 rows=16528 loops=1)
                                Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
                    ->  Index Scan using index_ad_group_performances_on_vendor_instance_id_and_date on scenario_ad_group_performances  (cost=0.00..1.73 rows=11 width=44) (actual time=0.002..0.003 rows=3 loops=14197)
                          Index Cond: ((vendor_instance_id = scenario_ad_group_vendor_instances.id) AND (date >= '2012-02-01'::date) AND (date <= '2012-02-28'::date))
Total runtime: 159.710 ms

Production:

Limit  (cost=822401.35..822401.40 rows=20 width=179) (actual time=21279.547..21279.591 rows=20 loops=1)
  ->  Sort  (cost=822401.35..822488.42 rows=34828 width=179) (actual time=21279.543..21279.560 rows=20 loops=1)
        Sort Key: (sum(scenario_ad_group_performances.clicks))
        Sort Method: top-N heapsort  Memory: 33kB
        ->  GroupAggregate  (cost=775502.60..821474.59 rows=34828 width=179) (actual time=19126.783..21226.772 rows=34495 loops=1)
              ->  Sort  (cost=775502.60..776739.48 rows=494751 width=179) (actual time=19125.902..19884.164 rows=675253 loops=1)
                    Sort Key: scenario_ad_groups.id
                    Sort Method: external merge  Disk: 94200kB
                    ->  Hash Right Join  (cost=25743.86..596796.70 rows=494751 width=179) (actual time=1155.491..16720.460 rows=675253 loops=1)
                          Hash Cond: (scenario_ad_group_performances.vendor_instance_id = scenario_ad_group_vendor_instances.id)
                          ->  Seq Scan on scenario_ad_group_performances  (cost=0.00..476354.29 rows=4158678 width=44) (actual time=0.043..8949.640 rows=4307019 loops=1)
                                Filter: ((date >= '2012-02-01'::date) AND (date <= '2012-02-28'::date))
                          ->  Hash  (cost=24047.72..24047.72 rows=51371 width=143) (actual time=1123.896..1123.896 rows=34495 loops=1)
                                Buckets: 1024  Batches: 16  Memory Usage: 392kB
                                ->  Hash Right Join  (cost=6625.90..24047.72 rows=51371 width=143) (actual time=92.257..1070.786 rows=34495 loops=1)
                                      Hash Cond: (scenario_ad_group_vendor_instances.ad_group_id = scenario_ad_groups.id)
                                      ->  Seq Scan on scenario_ad_group_vendor_instances  (cost=0.00..11336.31 rows=317174 width=8) (actual time=0.020..451.496 rows=431770 loops=1)
                                            Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))
                                      ->  Hash  (cost=5475.55..5475.55 rows=34828 width=139) (actual time=88.311..88.311 rows=34495 loops=1)
                                            Buckets: 1024  Batches: 8  Memory Usage: 726kB
                                            ->  Bitmap Heap Scan on scenario_ad_groups  (cost=798.20..5475.55 rows=34828 width=139) (actual time=4.451..44.065 rows=34495 loops=1)
                                                  Recheck Cond: (scenario_id = 276)
                                                  ->  Bitmap Index Scan on index_scenario_ad_groups_on_scenario_id  (cost=0.00..789.49 rows=34828 width=0) (actual time=4.232..4.232 rows=37006 loops=1)
                                                        Index Cond: (scenario_id = 276)
Total runtime: 21306.697 ms

Disclaimer

I have used PostgreSQL very little. I'm answering based on my knowledge of SQL Server index usage and execution plans. I ask the PostgreSQL gods for mercy if I get something wrong.

Query Optimizers are Dynamic

You said your query plan has changed from your development to production environments. This is to be expected. Query optimizers are designed to generate the optimum execution plan based on the current data conditions. Under different conditions the optimizer may decide it is more efficient to use a table scan vs an index scan.

When would it be more efficient to use a table scan vs an index scan?

SELECT A, B
FROM someTable
WHERE A = 'SOME VALUE'

Let's say you have a non-clustered index on column A. In this case you are filtering on column A, which could potentially take advantage of the index. This would be efficient if the index is selective enough - basically, how many distinct values make up the index? The database keeps statistics on this selectivity info and uses these statistics when calculating costs for execution plans.

If you have a million rows in a table, but only 10 possible values for A, then your query would likely return about 100K rows. Because the index is non-clustered, and you are returning column B, which is not included in the index, a lookup will need to be performed for each row returned. These look-ups are random-access lookups, which are much more expensive than the sequential reads used by a table scan. At a certain point it becomes more efficient for the database to just perform a table scan rather than an index scan.
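In PostgreSQL you can inspect the statistics the planner keeps for exactly this decision; a minimal sketch, reusing the (lower-cased) names from the example above:

-- Sketch only: n_distinct and null_frac feed the planner's selectivity estimates.
-- ANALYZE sometable; refreshes them. Unquoted identifiers fold to lowercase.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'sometable' AND attname = 'a';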

This is just one scenario; there are many others. It's hard to know without knowing more about what your data is like, what your indexes look like, and how you are trying to access the data.

To answer the original question:

Would PostgreSQL abstain from using indexes if they (or the table) are too big? No. It is more likely that, for the way you are accessing the data, it is less efficient for PostgreSQL to use the index than to use a table scan.

The PostgreSQL FAQ touches on this very subject (see: Why are my queries slow? Why don't they use my indexes?): https://wiki.postgresql.org/wiki/FAQ#Why_are_my_queries_slow.3F_Why_don.27t_they_use_my_indexes.3F

Postgres' query optimizer comes up with multiple scenarios (e.g. index vs seq-scan) and evaluates them using statistical information about your tables and the relative costs of disk/memory/index/table access set in the configuration.
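Those relative costs are ordinary configuration parameters; a hedged sketch of inspecting them and experimenting per session:

-- Planner cost settings (defaults: seq_page_cost 1.0, random_page_cost 4.0).
SHOW seq_page_cost;
SHOW random_page_cost;
-- Session-level experiment only; a lower random_page_cost favors index scans
-- when data is mostly cached or on fast storage.
SET random_page_cost = 2.0;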

Did you use the EXPLAIN command to see why index use was omitted? Did you use EXPLAIN ANALYZE to find out if the decision was in error? Can we see the outputs, please?

Edit:

As hard as it is to analyze two different queries on different systems, I think I see a couple of things.

The production environment has an actual/cost ratio of around 20-100 milliseconds per cost unit. I'm not even a DBA, but this seems consistent. The development environment has 261 for the main query. Does this seem right? Would you expect the raw speed (memory/disk/CPU) of the production environment to be 2-10x faster than dev?

Since the production environment has a much more complex query plan, it looks like it's doing its job. Undoubtedly, the dev environment's plan, and many more, have been considered and deemed too costly. And the 20-100 variance isn't that much in my experience (but again, not a DBA) and shows that there isn't anything way off the mark. Still, you may want to run a VACUUM on the DB just in case.
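A minimal sketch of that (the ANALYZE part, which refreshes planner statistics, is usually the more relevant one here):

-- Reclaim dead rows and refresh statistics for the whole database...
VACUUM ANALYZE;
-- ...or target just the biggest table from the plans above:
VACUUM ANALYZE scenario_ad_group_performances;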

I'm not experienced and patient enough to decode the full query, but could there be a denormalization/NoSQL-ization point for optimization?

The biggest bottleneck seems to be the disk merge at 90 MB. If the production environment has enough memory, you may want to increase the relevant setting (working memory?) to do it in memory. It seems to be the work_mem parameter here, though you'll want to read through the rest.
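A sketch of trying that for one session (the 128MB figure is only a guess sized above the 94 MB on-disk sort; tune it to the machine):

-- work_mem is the per-sort / per-hash memory budget for a session.
SET work_mem = '128MB';
-- Re-run EXPLAIN ANALYZE; if it helps, "Sort Method: external merge  Disk: ..."
-- should turn into an in-memory sort.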

I'd also suggest having a look at the index usage statistics. Many options with partial and functional indices exist.
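For example, pg_stat_user_indexes shows how often each index is actually used, and a partial index could cover just the vendors filtered on above (a sketch; the index name and predicate are assumptions based on the posted plans):

-- How many scans each index has served since statistics were last reset.
SELECT relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan;

-- Hypothetical partial index matching the ad_vendor_id filter in the plans.
CREATE INDEX index_vendor_instances_on_ad_group_id_vendors_123
    ON scenario_ad_group_vendor_instances (ad_group_id)
    WHERE ad_vendor_id IN (1, 2, 3);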

Try

SET enable_seqscan TO 'off'

before EXPLAIN ANALYZE.
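Putting that together as a sketch (this is a diagnostic planner hint for one session, not something to leave on in production):

SET enable_seqscan TO 'off';   -- strongly discourages (but does not forbid) sequential scans
-- EXPLAIN ANALYZE <the original query goes here>;
RESET enable_seqscan;          -- restore the default afterwards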

It seems to me that your dev data is much "simpler" than the production data. As an example:

Development:

->  Index Scan using index_scenario_ad_group_vendor_instances_on_ad_group_id on scenario_ad_group_vendor_instances  
(cost=0.00..1737.02 rows=27447 width=8) 
(actual time=0.007..7.021 rows=16528 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))

Production:

->  Seq Scan on scenario_ad_group_vendor_instances  
(cost=0.00..11336.31 rows=317174 width=8) 
(actual time=0.020..451.496 rows=431770 loops=1)
Filter: (ad_vendor_id = ANY ('{1,2,3}'::integer[]))

This means that in dev, 27447 matching rows were estimated upfront and 16528 rows were indeed found. That's the same ballpark, and OK.

In production, 317174 matching rows were estimated upfront and 431770 rows were found. Also OK.

But comparing dev to prod, the numbers differ by a factor of 10. As other answers already indicate, doing 10 times more random seeks (due to index access) might indeed be worse than a plain table scan.

Hence the interesting question is: how many rows does this table contain, both in dev and in prod? Is number_returned_rows / number_total_rows comparable between dev and prod?
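A quick, hedged way to answer that (the exact count can be slow on the production table; the planner's estimate from pg_class is essentially free):

-- Exact row count (potentially slow on large tables):
SELECT count(*) FROM scenario_ad_group_vendor_instances;

-- Planner's estimate, maintained in the catalog by ANALYZE:
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'scenario_ad_group_vendor_instances';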

Edit: Don't forget that I have picked one index access as an example. A quick glance shows that the other index accesses have the same symptoms.
