
Next query after large delete is slow. Why does a SELECT trigger WAL file archiving?

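Can someone explain what is going on here? Specifically:

  1. Why is the SELECT after the DELETE so slow? I understand that it now has to navigate dead tuples, but is that really so much more expensive?
  2. Why does the SELECT seem to cause log file archiving? This is unexpected, since a SELECT shouldn't generate any WAL.
  3. The SELECT appears to wait for the log file archiving before returning its answer. Why?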

Environment:

  • PostgreSQL v 14.2
  • Log level = debug1 (so I can see the archived WAL activity).

I have a streaming replica and a WAL archive directory.

I created a simple table to hold 50 million rows.

CREATE TABLE IF NOT EXISTS bigt
(
    a integer,
    b integer,
    des text
);

Now populate the table with some data (50 million rows is enough):

insert into bigt select i,mod(i,4),md5( mod(i,4)::text) from generate_series(1,50000000) i;

Let's see how long it takes to query it (a full scan):

select count(*) from bigt;
  count
----------
 50000000
(1 row)

Time: 1459.486 ms (00:01.459)

OK, so less than 1.5 seconds (note that the second and subsequent runs will of course be faster because of caching).

Now I delete half of the rows in the table, then re-run the count and watch what happens:

delete from bigt where a>25000000;
DELETE 25000000
Time: 41669.131 ms (00:41.669)
mydb=# select count(*) from bigt;
  count
----------
 25000000
(1 row)

Time: 22453.483 ms (00:22.453)

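Wow. 22.5 seconds, and I was tailing the PostgreSQL log file at the same time. After the DELETE statement I waited 10 seconds before running the SELECT. The act of running the SELECT seems to cause a series of log lines like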

DEBUG:  archived write-ahead log file "0000000100000023000000F6"

About 10 seconds' worth of these lines later, the SELECT completed, having taken 22.5 seconds in total!
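
One way to test whether the SELECT itself writes WAL is to compare the WAL insert position before and after it. A minimal psql sketch, assuming an otherwise idle server (any concurrent write activity inflates the figure) and the bigt table from above; before_lsn is only an illustrative variable name:

```sql
-- Remember the current WAL position in a psql variable (\gset is a psql meta-command).
SELECT pg_current_wal_lsn() AS before_lsn \gset

-- The query under test.
SELECT count(*) FROM bigt;

-- WAL written since the remembered position, by any session on the server.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn'::pg_lsn)
       ) AS wal_written;
```

A result far above a few kilobytes on a quiet system would suggest that the read-only query is producing WAL.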

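UPDATE: Suspecting that the SELECT was triggering something, I simplified the scenario and, to rule out autovacuum kicking in during the test, disabled autovacuum for this test table.

Empty the table (truncate). Same schema as before.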

mydb=# ALTER TABLE bigt SET (autovacuum_enabled = off);
ALTER TABLE
Time: 2.436 ms
mydb=# truncate bigt;
TRUNCATE TABLE
Time: 196.077 ms

Insert 30 million rows:

mydb=# insert into bigt select i,mod(i,4),md5( mod(i,4)::text) from generate_series(1,30000000) i;
INSERT 0 30000000
Time: 57994.379 ms (00:57.994)

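Wait a few seconds, then issue the SELECT (I was tailing the log, and this SELECT appears to cause the archiving activity).

This time I ran it with the EXPLAIN ANALYZE options I normally use: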

db=# explain (analyze,buffers,verbose) select count(*) from bigt;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=384761.97..384761.98 rows=1 width=8) (actual time=22437.498..22463.793 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=280374 dirtied=280374 written=94611
   I/O Timings: read=59174.975 write=1503.423
   ->  Gather  (cost=384761.91..384761.96 rows=4 width=8) (actual time=22437.135..22463.785 rows=5 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 4
         Workers Launched: 4
         Buffers: shared read=280374 dirtied=280374 written=94611
         I/O Timings: read=59174.975 write=1503.423
         ->  Partial Aggregate  (cost=383761.91..383761.92 rows=1 width=8) (actual time=22410.083..22410.083 rows=1 loops=5)
               Output: PARTIAL count(*)
               Buffers: shared read=280374 dirtied=280374 written=94611
               I/O Timings: read=59174.975 write=1503.423
               Worker 0:  actual time=22403.414..22403.415 rows=1 loops=1
                 Buffers: shared read=55891 dirtied=55891 written=18963
                 I/O Timings: read=11813.538 write=323.569
               Worker 1:  actual time=22403.428..22403.429 rows=1 loops=1
                 Buffers: shared read=55196 dirtied=55196 written=18936
                 I/O Timings: read=11892.729 write=322.704
               Worker 2:  actual time=22403.400..22403.401 rows=1 loops=1
                 Buffers: shared read=55584 dirtied=55584 written=18621
                 I/O Timings: read=11842.255 write=317.909
               Worker 3:  actual time=22403.248..22403.249 rows=1 loops=1
                 Buffers: shared read=55424 dirtied=55424 written=18837
                 I/O Timings: read=11814.539 write=288.189
               ->  Parallel Seq Scan on public.bigt  (cost=0.00..363084.33 rows=8271033 width=0) (actual time=0.354..21911.602 rows=6000000 loops=5)
                     Output: a, b, des
                     Buffers: shared read=280374 dirtied=280374 written=94611
                     I/O Timings: read=59174.975 write=1503.423
                     Worker 0:  actual time=0.174..21909.475 rows=5980319 loops=1
                       Buffers: shared read=55891 dirtied=55891 written=18963
                       I/O Timings: read=11813.538 write=323.569
                     Worker 1:  actual time=0.526..21914.253 rows=5905972 loops=1
                       Buffers: shared read=55196 dirtied=55196 written=18936
                       I/O Timings: read=11892.729 write=322.704
                     Worker 2:  actual time=0.519..21908.078 rows=5947488 loops=1
                       Buffers: shared read=55584 dirtied=55584 written=18621
                       I/O Timings: read=11842.255 write=317.909
                     Worker 3:  actual time=0.525..21909.759 rows=5930368 loops=1
                       Buffers: shared read=55424 dirtied=55424 written=18837
                       I/O Timings: read=11814.539 write=288.189
 Query Identifier: -3522295412005428879
 Planning:
   Buffers: shared hit=7 read=4 dirtied=1
   I/O Timings: read=0.025
 Planning Time: 0.312 ms
 Execution Time: 22463.902 ms
(48 rows)

Time: 22465.325 ms (00:22.465)

The log file extract shows the SELECT and its effect on log file archiving. The EXPLAIN SELECT seems to trigger a lot of WAL archiving activity.

2022-08-22 17:13:23.788 BST [26503]  DEBUG:  archived write-ahead log file "00000001000000260000004D"
2022-08-22 17:13:23.997 BST [26503]  DEBUG:  archived write-ahead log file "00000001000000260000004E"
2022-08-22 17:13:24.412 BST [26503]  DEBUG:  archived write-ahead log file "00000001000000260000004F"
2022-08-22 17:13:24.635 BST [26503]  DEBUG:  archived write-ahead log file "000000010000002600000050"
2022-08-22 17:13:25.037 BST [61194] usr LOG:  duration: 57994.310 ms
2022-08-22 17:13:30.715 BST [61194] usr LOG:  statement: explain (analyze,buffers,verbose) select count(*) from bigt;
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  registering background worker "parallel worker for PID 61194"
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  registering background worker "parallel worker for PID 61194"
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  registering background worker "parallel worker for PID 61194"
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  registering background worker "parallel worker for PID 61194"
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  starting background worker process "parallel worker for PID 61194"
2022-08-22 17:13:30.716 BST [26496]  DEBUG:  starting background worker process "parallel worker for PID 61194"
2022-08-22 17:13:30.717 BST [26496]  DEBUG:  starting background worker process "parallel worker for PID 61194"
2022-08-22 17:13:30.717 BST [26496]  DEBUG:  starting background worker process "parallel worker for PID 61194"
2022-08-22 17:13:30.830 BST [26503]  DEBUG:  archived write-ahead log file "000000010000002600000051"
2022-08-22 17:13:30.880 BST [26503]  DEBUG:  archived write-ahead log file "000000010000002600000052"
2022-08-22 17:13:31.067 BST [26503]  DEBUG:  archived write-ahead log file "000000010000002600000053"
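
The archiver's progress can also be followed from SQL instead of the debug log. A small sketch using the pg_stat_archiver statistics view; running it before and after the query under test shows how many segments were archived in between, although on a busy server other sessions contribute as well:

```sql
-- Cumulative archiver statistics: segments archived so far and the most recent one.
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count
FROM pg_stat_archiver;
```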

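I thought this was instructive. So I then ran a manual VACUUM on the table (it reported 0 rows removed, which is what I would expect since I had only just inserted the 30 million rows).

Then I repeated the EXPLAIN SELECT, and this time the plan showed no buffers being dirtied. I think this is the biggest clue.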

db=# explain (analyze,buffers,verbose) select count(*) from bigt;
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=375124.06..375124.07 rows=1 width=8) (actual time=1030.392..1036.682 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=65150 read=215224
   I/O Timings: read=495.941
   ->  Gather  (cost=375124.00..375124.05 rows=4 width=8) (actual time=1030.270..1036.676 rows=5 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 4
         Workers Launched: 4
         Buffers: shared hit=65150 read=215224
         I/O Timings: read=495.941
         ->  Partial Aggregate  (cost=374124.00..374124.01 rows=1 width=8) (actual time=1010.259..1010.260 rows=1 loops=5)
               Output: PARTIAL count(*)
               Buffers: shared hit=65150 read=215224
               I/O Timings: read=495.941
               Worker 0:  actual time=1005.289..1005.289 rows=1 loops=1
                 Buffers: shared hit=12674 read=42481
                 I/O Timings: read=97.199
               Worker 1:  actual time=1005.322..1005.323 rows=1 loops=1
                 Buffers: shared hit=12833 read=43200
                 I/O Timings: read=99.123
               Worker 2:  actual time=1005.319..1005.320 rows=1 loops=1
                 Buffers: shared hit=12864 read=42989
                 I/O Timings: read=100.243
               Worker 3:  actual time=1005.289..1005.290 rows=1 loops=1
                 Buffers: shared hit=12793 read=43094
                 I/O Timings: read=98.914
               ->  Parallel Seq Scan on public.bigt  (cost=0.00..355374.00 rows=7500000 width=0) (actual time=0.140..613.439 rows=6000000 loops=5)
                     Output: a, b, des
                     Buffers: shared hit=65150 read=215224
                     I/O Timings: read=495.941
                     Worker 0:  actual time=0.204..614.154 rows=5901585 loops=1
                       Buffers: shared hit=12674 read=42481
                       I/O Timings: read=97.199
                     Worker 1:  actual time=0.171..613.932 rows=5995531 loops=1
                       Buffers: shared hit=12833 read=43200
                       I/O Timings: read=99.123
                     Worker 2:  actual time=0.132..613.952 rows=5976271 loops=1
                       Buffers: shared hit=12864 read=42989
                       I/O Timings: read=100.243
                     Worker 3:  actual time=0.179..611.561 rows=5979909 loops=1
                       Buffers: shared hit=12793 read=43094
                       I/O Timings: read=98.914
 Query Identifier: -3522295412005428879
 Planning:
   Buffers: shared hit=1 read=1
   I/O Timings: read=0.663
 Planning Time: 1.231 ms
 Execution Time: 1036.729 ms
(48 rows)

Time: 1040.606 ms (00:01.041)

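Strangely, the query still triggers all of this WAL archiving activity. It is reliably repeatable. I just don't understand why.

The clue that the SELECT was dirtying some buffers led me to read about the setting of the "all visible" bit in each row header. I need to go and read up on that, as it sounds relevant. Thanks to everyone for the help!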
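
The per-row bits in question can be inspected directly. A hedged sketch, assuming the pageinspect contrib extension is available and the session has superuser rights; it decodes the commit-status hint bits in t_infomask for the first few tuples of block 0 of bigt. The mask values 256 and 1024 correspond to HEAP_XMIN_COMMITTED (0x0100) and HEAP_XMAX_COMMITTED (0x0400) in the PostgreSQL source:

```sql
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Show whether the hint bits are set on the first tuples of the first heap page.
SELECT lp,
       t_xmin,
       t_xmax,
       (t_infomask & 256)  <> 0 AS xmin_committed_hint,
       (t_infomask & 1024) <> 0 AS xmax_committed_hint
FROM heap_page_items(get_raw_page('bigt', 0))
LIMIT 5;
```

Running this right after the bulk INSERT or DELETE and again after the first full scan should show the flags changing from false to true, which is the page modification that EXPLAIN reports as "dirtied".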

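Non-default settings are: ... wal_log_hints=on,

Well, there is the answer to why it is archiving. Running a SELECT on a freshly modified table sets hint bits, and with that setting doing so generates WAL, which then needs to be archived.

The SELECT does not wait for the archiving to happen. But by triggering the archiving, it has to compete with it for resources. That is probably not the main reason for the slowness, though. Even when it generates no WAL, setting hint bits still costs CPU and I/O.

There was a proposal to add a setting limiting how many hint bits a SELECT is willing to set, but I don't think it was ever accepted. On the one hand, the SELECT has already done the work of determining that a tuple is not visible to it, so it should set the bit to spare future SELECTs from repeating that determination. On the other hand, setting hint bits is best done by autovacuum (nobody is waiting on it), so why should a SELECT annoy its client just to steal part of autovacuum's work?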
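
To confirm this on a given server, the relevant settings and the WAL produced by the query can be checked directly. A minimal sketch, assuming PostgreSQL 13 or later (where EXPLAIN accepts the WAL option) and the bigt table from the question; data_checksums is included because enabling checksums causes hint-bit changes to be WAL-logged in the same way:

```sql
-- Either of these being "on" means hint-bit updates can write full-page images to WAL.
SHOW wal_log_hints;
SHOW data_checksums;

-- Report WAL records, full-page images and bytes generated by the statement itself.
EXPLAIN (ANALYZE, BUFFERS, WAL)
SELECT count(*) FROM bigt;
```

If the plan output includes a "WAL:" line with a non-zero fpi count, the scan is logging full-page images as it sets hint bits.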
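
In the question's second test, a manual VACUUM made the following scan fast. One option, sketched here as an illustration rather than a recommendation from the answer, is to do that work up front, right after the bulk change, so that no later SELECT has to pay for it. This assumes the bigt table from the question:

```sql
-- Bulk change that leaves many tuples without hint bits.
DELETE FROM bigt WHERE a > 25000000;

-- Set hint bits, prune dead tuples and refresh statistics in one pass.
-- VACUUM cannot run inside a transaction block.
VACUUM (VERBOSE, ANALYZE) bigt;

-- Re-enable autovacuum for the table if it was switched off for testing.
ALTER TABLE bigt RESET (autovacuum_enabled);
```

The VACUUM still performs the same writes and generates the same kind of WAL; the difference is only that it happens at a time of the operator's choosing.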
