
Optimize Postgres query by indexing data

My existing application runs on Heroku and uses Postgres as its database.

My queries are getting slow because of the growing amount of data. Here is my query:

SELECT *
FROM my_table
WHERE my_table.is_deleted = $1
  AND my_table.id NOT IN (SELECT my_table_user_actions.qurb_id AS my_table_user_actions_qurb_id
                          FROM my_table_user_actions
                          WHERE my_table_user_actions.user_id = $2
                            AND my_table_user_actions.is_hidden = $3)
  AND my_table.block_x BETWEEN $4 AND $5
  AND my_table.block_y BETWEEN $6 AND $7
  AND my_table.id NOT IN (SELECT sponsored_qurb_log.qurb_id AS sponsored_qurb_log_qurb_id
                          FROM sponsored_qurb_log
                          WHERE sponsored_qurb_log.qurb_id = my_table.id
                            AND sponsored_qurb_log.hash = $8
                            AND sponsored_qurb_log.user_id = $9)

This query takes almost 10 seconds to execute on the server.

I would like to add indexes on the following columns:

  • is_deleted is of type boolean
  • block_x is of type int
  • block_y is of type int

These are the three columns. Here is_deleted is always set to false in the query, because I only ever want the records that are not deleted. block_x and block_y are the columns that hold latitude and longitude.

Please let me know which indexes would suit this query.

Here is what I'm thinking about.

Multi-column index:

CREATE INDEX my_table_xandy_block ON my_table(block_x, block_y);

And a partial index for is_deleted:

CREATE INDEX is_deleted_index ON my_table(is_deleted) WHERE is_deleted IS FALSE;

Kindly check my queries and let me know what I should do to optimize them. I would rather not change the query itself, since that would mean deploying a newer version of the code.

In general, you have to examine the EXPLAIN (ANALYZE, BUFFERS) output for the query to answer such a question.
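As a rough sketch of how that output could be obtained for the query in the question: the literal values below are placeholders standing in for $1 to $9, and their types (integer user IDs, a text hash) are assumptions, not something the question states. Note that EXPLAIN with the ANALYZE option actually executes the statement.

```sql
-- Placeholder literals replace the bind parameters $1..$9; substitute
-- values that are typical for the slow production calls.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM my_table
WHERE my_table.is_deleted = false                      -- $1
  AND my_table.id NOT IN (SELECT my_table_user_actions.qurb_id
                          FROM my_table_user_actions
                          WHERE my_table_user_actions.user_id = 1          -- $2 (assumed integer)
                            AND my_table_user_actions.is_hidden = true)    -- $3
  AND my_table.block_x BETWEEN 100 AND 200             -- $4, $5
  AND my_table.block_y BETWEEN 100 AND 200             -- $6, $7
  AND my_table.id NOT IN (SELECT sponsored_qurb_log.qurb_id
                          FROM sponsored_qurb_log
                          WHERE sponsored_qurb_log.qurb_id = my_table.id
                            AND sponsored_qurb_log.hash = 'abc'            -- $8 (assumed text)
                            AND sponsored_qurb_log.user_id = 1);           -- $9 (assumed integer)
```

The plan nodes with the largest actual time and buffer counts are usually the ones worth looking at first.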

But in your case it is simple: you will have to convert the NOT IN clauses into WHERE NOT EXISTS.

An example:

WHERE a.x NOT IN (
   SELECT b.y FROM b
)

should become

WHERE NOT EXISTS (
   SELECT 1 FROM b
   WHERE a.x = b.y
)

That way PostgreSQL can use an “antijoin” to process the query, which will be faster for bigger tables.
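Applied to the query from the question, the rewrite might look like the sketch below. It assumes that my_table.id and my_table_user_actions.qurb_id are never NULL; the question does not say so, and the two forms are only interchangeable under that assumption.

```sql
-- Sketch: the question's query with both NOT IN clauses rewritten as NOT EXISTS.
-- Assumption: my_table.id and my_table_user_actions.qurb_id are never NULL.
SELECT *
FROM my_table
WHERE my_table.is_deleted = $1
  AND my_table.block_x BETWEEN $4 AND $5
  AND my_table.block_y BETWEEN $6 AND $7
  -- The first NOT IN was uncorrelated; the join condition on qurb_id is added here.
  AND NOT EXISTS (SELECT 1
                  FROM my_table_user_actions
                  WHERE my_table_user_actions.qurb_id = my_table.id
                    AND my_table_user_actions.user_id = $2
                    AND my_table_user_actions.is_hidden = $3)
  -- The second subquery was already correlated on qurb_id = my_table.id.
  AND NOT EXISTS (SELECT 1
                  FROM sponsored_qurb_log
                  WHERE sponsored_qurb_log.qurb_id = my_table.id
                    AND sponsored_qurb_log.hash = $8
                    AND sponsored_qurb_log.user_id = $9);
```

The NULL caveat matters because x NOT IN (subquery) is never true when the subquery returns a NULL, so the original query would return no rows in that case, while NOT EXISTS simply ignores such rows.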

To speed up the query further, look at the execution plan and add indexes as appropriate.

If you really refuse to rewrite the query, the best you can do are the following indexes:

CREATE INDEX ON my_table_user_actions (user_id, is_hidden);

CREATE INDEX ON sponsored_qurb_log (qurb_id, hash, user_id);

