PostgreSQL complex indexing types and ordering

I have been doing some heavy reading over the last couple of days on indexing, and I'm trying to figure out the right way to index a query I have with a lot of constraints. I am using the postgres_ext gem to support array datatypes and GIN and GiST index types.

I have two queries:

.where("a_id IN (?) and b = ? and active = ? and ? != ALL(c) and ? = ANY(d)")
.where("a_id =? and active =? and ? != ALL(c)")

c and d are integer arrays.

The indexes I plan on adding:

 add_index :deals, [:a, :b], :where => "active = true"
 add_index :deals, [:c, :d], :index_type => :gin, :where => "active = true"

Will Postgres use both of these multicolumn indexes in the first query?

Should array datatypes always be in "gin" index types, or can you also put them in a b-tree index?

And finally, will the first index be used for 'a' in both of the queries?

Additional information:

I am using PostgreSQL 9.1.3.

create_table "table", :force => true do |t|
 t.integer  "a_id"                          # foreign key
 t.string   "title"
 t.text     "description", :default => ""
 t.boolean  "active",      :default => true
 t.datetime "created_at",  :null => false
 t.datetime "updated_at",  :null => false
 t.integer  "b"
 t.integer  "c", :limit => 8, :array => true
 t.integer  "d", :array => true
end

Regarding arrays and GIN, you can have a b-tree index of arrays, but it isn't useful for operations like "array contains element". You need GIN or GiST for that, and only GIN is supported as a built-in index for all array types.
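As a hedged illustration of what "array contains element" looks like as a predicate a GIN index can serve: the built-in GIN operator class for arrays works with the array operators `@>`, `<@`, `&&` and `=`, so a condition written as `? = ANY(d)` may need to be phrased as `d @> ARRAY[?]` before the index is considered. The helper name below is hypothetical, and the snippet only builds the SQL fragment; it does not talk to a database.

```ruby
# Hypothetical helper: builds an "array column contains this element"
# condition using the @> (contains) operator.
def contains_element_condition(column, placeholder = "?")
  "#{column} @> ARRAY[#{placeholder}]"
end

CONDITION = contains_element_condition(:d)
puts CONDITION
```

In a Rails query this fragment would take the place of the `? = ANY(d)` part of the `where` string. Whether the planner then picks the index is still something to confirm with explain analyze.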

You can also use the intarray extension and its GiST index type for integer arrays. A GiST index will perform better under write load but worse under read load.
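For reference, here is a rough sketch of the DDL these two options correspond to. The helper name `create_index_sql` and the index names are made up for illustration, and the `deals` table name is taken from the question's `add_index` calls. The `gist__int_ops` operator class comes from the intarray extension, which only covers plain `integer[]` columns, so it is shown for `d` and not for the 8-byte `c`.

```ruby
# Builds a CREATE INDEX statement as a plain string.
# Helper name and index names are hypothetical.
def create_index_sql(name, table, columns, using: "btree", opclass: nil, where: nil)
  cols = columns.map { |c| opclass ? "#{c} #{opclass}" : c.to_s }.join(", ")
  sql = "CREATE INDEX #{name} ON #{table} USING #{using} (#{cols})"
  sql += " WHERE #{where}" if where
  sql
end

# Built-in GIN index over both array columns, restricted to active rows.
GIN_SQL = create_index_sql("deals_c_d_gin", "deals", [:c, :d],
                           using: "gin", where: "active = true")

# intarray GiST index; assumes CREATE EXTENSION intarray has been run.
GIST_SQL = create_index_sql("deals_d_gist", "deals", [:d],
                            using: "gist", opclass: "gist__int_ops",
                            where: "active = true")

puts GIN_SQL
puts GIST_SQL
```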

As for determining whether Pg will use both indexes, the best way to tell is to use EXPLAIN ANALYZE and see. Get the statement Rails executes from the PostgreSQL log by enabling log_statement, or from the Rails logs with SQL logging on. Then run it in psql with explain analyze. Alternately, use the auto_explain extension to capture performance reports on the queries as they run.
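A minimal sketch of that step, assuming you have already copied a statement out of the logs. The helper name `explain_analyze` and the literal values in the sample query are invented for illustration. Note that EXPLAIN ANALYZE actually executes the statement, which is harmless for a SELECT.

```ruby
# Hypothetical helper: wraps a logged statement in EXPLAIN ANALYZE so it
# can be pasted into psql. Any trailing semicolon is normalised.
def explain_analyze(sql)
  "EXPLAIN ANALYZE #{sql.strip.chomp(';')};"
end

# Sample statement shaped like the second query in the question;
# the values 42 and 7 are made up.
QUERY = "SELECT * FROM deals WHERE a_id = 42 AND active = true AND 7 != ALL(c);"
PLAN_SQL = explain_analyze(QUERY)

puts PLAN_SQL
# Inside a Rails app, the same string could also be run with
# ActiveRecord::Base.connection.execute(PLAN_SQL) and the rows printed.
```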

I have the feeling that you'll find that Pg can't combine a GiST or GIN and a b-tree index in the same filter. Combining indexes requires a bitmap index scan, and IIRC that's only available for two b-tree indexes. You'd probably need to add the extra columns to the GiST or GIN index, but that'll increase the index size quite dramatically and may not be worth it.

You really need to use explain analyze to see how it works in the real world on sample or production data.

When working with multicolumn indexes, keep in mind that, at least for b-tree indexes, Pg can use an index on (a,b) for queries that filter on a or on both a and b, but not for queries that filter only on b. Indexes are usable left-to-right: you can't use an index to search for a value on the right side of the index unless you're also searching on all values to the left of it.
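The left-to-right rule can be sketched as a toy model. This is only a simplification of the rule as stated here, not a description of how the planner actually decides, and the function name `usable_prefix` is made up.

```ruby
# Returns the leading index columns that a query's filters can make use
# of: matching stops at the first index column the query does not filter on.
def usable_prefix(index_columns, filtered_columns)
  index_columns.take_while { |col| filtered_columns.include?(col) }
end

INDEX = [:a, :b]

p usable_prefix(INDEX, [:a])      # => [:a]
p usable_prefix(INDEX, [:a, :b])  # => [:a, :b]
p usable_prefix(INDEX, [:b])      # => []
```

An empty result corresponds to the "filter only on b" case, where the index on (a,b) does not help.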
