SQL query intermittently slow in PostgreSQL

Question

When running the following query, it sometimes takes 15 seconds and sometimes 90 minutes. What causes this big difference?

INSERT INTO missing_products 
SELECT table_name, 
   product_id 
FROM   products 
WHERE  table_name = 'xxxxxxxxx' 
   AND product_id NOT IN (SELECT id 
                               FROM new_products);

I have run EXPLAIN on it, and the only thing I can see is an index-only scan on new_products. I also rewrote the query to use a left join, inserting the rows where the right side is NULL, but that has the same timing problem.
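The left-join variant mentioned above was not posted. A minimal sketch of what such a rewrite might look like, assuming the same tables and columns as the original query:

```sql
-- Sketch only: anti-join written as LEFT JOIN ... IS NULL.
-- Assumes the table and column names from the question.
INSERT INTO missing_products (table_name, product_id)
SELECT p.table_name,
       p.product_id
FROM   products p
       LEFT JOIN new_products n
              ON n.id = p.product_id
WHERE  p.table_name = 'xxxxxxxxx'
   AND n.id IS NULL;  -- keep only products with no match in new_products
```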

I have the following tables, with structures roughly as follows.

products

id bigint not null,
product_id text not null,
table_name text not null,
primary key (id),
unique index (product_id)

new_products

id text not null,
title text not null,
primary key, btree (id)

missing_products

table_name text not null,
product_id text not null,
primary key (table_name, product_id)

Explain: this run has an extra field in the where clause, but it should give a good idea. It took 22 seconds.

 Insert on missing_products  (cost=5184.80..82764.35 rows=207206 width=38) (actual time=22466.525..22466.525 rows=0 loops=1)
   ->  Seq Scan on products  (cost=5184.80..82764.35 rows=207206 width=38) (actual time=0.055..836.217 rows=411150 loops=1)
         Filter: ((active > (-30)) AND (NOT (hashed SubPlan 1)) AND (feed = 'xxxxxxxx'::text))
         Rows Removed by Filter: 77436
         SubPlan 1
           ->  Index Only Scan using new_products_pkey on new_products  (cost=0.39..5184.74 rows=23 width=10) (actual time=0.027..0.027 rows=0 loops=1)
                 Heap Fetches: 0
 Planning time: 0.220 ms
 Execution time: 22466.596 ms

Answer 1

Looking at the output of your EXPLAIN ANALYZE, the SELECT takes barely 800 ms; most of the time, almost 22 seconds, is spent inserting rows.
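One way to check where the time goes is to time the SELECT on its own and then the full INSERT. This is a sketch assuming the query from the question. EXPLAIN ANALYZE actually executes the statement, so the INSERT is wrapped in a transaction that is rolled back:

```sql
-- Time the SELECT part alone.
EXPLAIN (ANALYZE, BUFFERS)
SELECT table_name,
       product_id
FROM   products
WHERE  table_name = 'xxxxxxxxx'
   AND product_id NOT IN (SELECT id FROM new_products);

-- Time the full INSERT, then undo it so no rows are kept.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO missing_products
SELECT table_name,
       product_id
FROM   products
WHERE  table_name = 'xxxxxxxxx'
   AND product_id NOT IN (SELECT id FROM new_products);
ROLLBACK;
```

If the two execution times differ by a large margin, the difference is the cost of the insert itself, which includes maintaining the primary key index on missing_products.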

Also, it seems that the statistics for your new_products table are not accurate: the plan predicts 23 rows whereas the actual count is 0. Though the plan looks correct now, this could be disastrous depending on how the new_products table is used throughout your app. I'd first ANALYZE the table at regular intervals if autoanalyze is not kicking in, and monitor the performance over a day's time.
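As a rough sketch of that check, the statistics can be refreshed by hand and the last analyze times read from the pg_stat_user_tables view. The table names are the ones from the question:

```sql
-- Refresh planner statistics for the table the estimate was off on.
ANALYZE new_products;

-- See when each table was last analyzed, manually or by autovacuum.
SELECT relname,
       n_live_tup,
       last_analyze,
       last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('products', 'new_products', 'missing_products');
```

A last_autoanalyze that is NULL or far in the past on a table that is regularly emptied and refilled would suggest autoanalyze is not keeping up.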

Answer 2

I would try two things:

Try adding an index on products.table_name, which you don't seem to have at the moment.
Try rewriting the query to use a not exists clause instead of not in. Sometimes the database can perform the query more efficiently that way.

Query with not exists:

INSERT INTO missing_products (table_name, product_id)
SELECT p.table_name, p.product_id 
  FROM products p
 WHERE p.table_name = 'xxxxxxxxx' 
   AND NOT EXISTS (SELECT null
                     FROM new_products n
                    WHERE n.id = p.product_id)
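
For the first suggestion, the index could be created along these lines. The index name is made up for illustration, and whether the planner actually uses the index depends on how selective table_name is:

```sql
-- Hypothetical index name; the column comes from the question's schema.
-- CREATE INDEX CONCURRENTLY avoids blocking writes while the index is
-- built, but it cannot run inside a transaction block.
CREATE INDEX products_table_name_idx
    ON products (table_name);
```

Since product_id and new_products.id are both declared NOT NULL, the not in and not exists forms should return the same rows here; the difference is only in how the planner can execute them.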

Answer 1: score 1, accepted, 2015-12-24 14:27:03
Answer 2: score 0, 2015-12-24 14:16:36
