How much CPU / bandwidth should PostgreSQL logical replication be using?

I set up logical replication on PostgreSQL 11 following these instructions: https://www.digitalocean.com/community/tutorials/how-to-set-up-logical-replication-with-postgresql-10-on-ubuntu-18-04

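Everything worked fine, and in testing the changes were replicated.

However, a month later, changes no longer appear to be replicated, and Postgres seems to be using a lot of CPU and bandwidth.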

  • Load average on the 2 vCPU / 4 GB DigitalOcean server is about 2.5.
  • Bandwidth is about 1 MB/s.
  • There is currently essentially zero activity on this server and database.

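This raises a few questions:

  1. Is it normal for an inactive database with logical streaming replication to use this many resources?
  2. Any ideas why replication appears to have stopped? (Changing records on the primary no longer affects the replica.)
  3. Are there any pro tips for monitoring and viewing the replication status?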

The Postgres log on the primary contains many messages like these:

2019-04-22 06:26:16.986 UTC [20371] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21198
2019-04-22 06:26:16.986 UTC [20371] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.010 UTC [20372] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC211D0
2019-04-22 06:26:17.010 UTC [20372] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.055 UTC [20373] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21208
2019-04-22 06:26:17.055 UTC [20373] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.078 UTC [20374] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21240
2019-04-22 06:26:17.078 UTC [20374] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.114 UTC [20375] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21278
2019-04-22 06:26:17.114 UTC [20375] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.154 UTC [20376] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC212B0
2019-04-22 06:26:17.154 UTC [20376] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.186 UTC [20377] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC212E8
2019-04-22 06:26:17.186 UTC [20377] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.229 UTC [20378] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21320
2019-04-22 06:26:17.229 UTC [20378] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:17.235 UTC [20378] replica_user@server_prod LOG:  could not send data to client: Connection reset by peer
2019-04-22 06:26:17.235 UTC [20378] replica_user@server_prod STATEMENT:  COPY public.class_registrations TO STDOUT
2019-04-22 06:26:17.235 UTC [20378] replica_user@server_prod FATAL:  connection to client lost
2019-04-22 06:26:17.235 UTC [20378] replica_user@server_prod STATEMENT:  COPY public.class_registrations TO STDOUT
2019-04-22 06:26:17.259 UTC [20379] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21358
2019-04-22 06:26:17.259 UTC [20379] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:21.327 UTC [20418] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21390
2019-04-22 06:26:21.327 UTC [20418] replica_user@server_prod DETAIL:  There are no running transactions.
2019-04-22 06:26:21.341 UTC [20419] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC213C8
2019-04-22 06:26:21.341 UTC [20419] replica_user@server_prod DETAIL:  There are no running transactions.
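
In the excerpt above, each "logical decoding found consistent point" line comes from a different backend PID, so counting those lines per second gives a rough measure of how often the subscriber reconnects. The following is a minimal Python sketch, assuming the log prefix format shown above; the function name `decoding_starts_per_second` and the `sample` list are illustrative, not part of any PostgreSQL tooling.

```python
import re

# Matches the log prefix shown above: timestamp, "UTC", backend PID in brackets,
# followed somewhere on the line by the logical decoding start message.
LINE_RE = re.compile(
    r"^(?P<second>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ UTC "
    r"\[(?P<pid>\d+)\] .*logical decoding found consistent point"
)

def decoding_starts_per_second(lines):
    """Count distinct backend PIDs that started logical decoding in each second."""
    pids_by_second = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            pids_by_second.setdefault(match["second"], set()).add(match["pid"])
    return {second: len(pids) for second, pids in pids_by_second.items()}

# A few lines copied from the primary log above.
sample = [
    "2019-04-22 06:26:16.986 UTC [20371] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21198",
    "2019-04-22 06:26:16.986 UTC [20371] replica_user@server_prod DETAIL:  There are no running transactions.",
    "2019-04-22 06:26:17.010 UTC [20372] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC211D0",
    "2019-04-22 06:26:17.055 UTC [20373] replica_user@server_prod LOG:  logical decoding found consistent point at 0/1EC21208",
]

for second, count in sorted(decoding_starts_per_second(sample).items()):
    print(second, count)
```

On an idle database, several new decoding sessions per second would suggest that workers are restarting in a loop rather than streaming steadily.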

The replica's log is full of messages like these:

2019-04-21 06:26:07.619 UTC [2967] LOG:  logical replication table synchronization worker for subscription "replica_subscription", table "messages" has started
2019-04-21 06:26:07.645 UTC [2966] ERROR:  duplicate key value violates unique constraint "account_locations_pkey"
2019-04-21 06:26:07.645 UTC [2966] DETAIL:  Key (id)=(1) already exists.
2019-04-21 06:26:07.645 UTC [2966] CONTEXT:  COPY account_locations, line 1
2019-04-21 06:26:07.648 UTC [16353] LOG:  background worker "logical replication worker" (PID 2966) exited with exit code 1
2019-04-21 06:26:07.652 UTC [2968] LOG:  logical replication table synchronization worker for subscription "replica_subscription", table "user_photos" has started
2019-04-21 06:26:07.663 UTC [2967] ERROR:  duplicate key value violates unique constraint "messages_pkey"
2019-04-21 06:26:07.663 UTC [2967] DETAIL:  Key (id)=(1) already exists.
2019-04-21 06:26:07.663 UTC [2967] CONTEXT:  COPY messages, line 1
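
To see which tables are failing their initial copy, the duplicate-key errors in a log like the one above can be tallied per constraint. This is a small Python sketch, assuming the error text appears exactly as shown; the helper name `duplicate_key_errors` and the `sample` list are made up for illustration.

```python
import re
from collections import Counter

# Matches the ERROR line format shown in the replica log above.
DUPLICATE_RE = re.compile(
    r'ERROR:  duplicate key value violates unique constraint "(?P<constraint>[^"]+)"'
)

def duplicate_key_errors(lines):
    """Return a Counter mapping constraint name to number of duplicate-key errors."""
    counts = Counter()
    for line in lines:
        match = DUPLICATE_RE.search(line)
        if match:
            counts[match["constraint"]] += 1
    return counts

# Lines copied from the replica log above.
sample = [
    '2019-04-21 06:26:07.645 UTC [2966] ERROR:  duplicate key value violates unique constraint "account_locations_pkey"',
    "2019-04-21 06:26:07.645 UTC [2966] DETAIL:  Key (id)=(1) already exists.",
    '2019-04-21 06:26:07.663 UTC [2967] ERROR:  duplicate key value violates unique constraint "messages_pkey"',
]

for constraint, count in duplicate_key_errors(sample).most_common():
    print(constraint, count)
```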

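Here is the load average over the last 6 hours (you can see where I dropped the subscription on the replica server):

[image: load average graph]

Here is the bandwidth:

[image: bandwidth graph]

Here is also an iftop capture from just ~10-15 seconds of monitoring:

[image: iftop output]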

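After reviewing the logs as Laurenz suggested, it appears that the sequences on the primary key IDs from my initial data load were incorrect for all tables. (I am not sure how that happened.)

To fix replication, I did the following: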

  1. Dropped the subscription on the replica server
  2. Dropped all tables
  3. Reloaded all tables, schema only (no data)
  4. Created the subscription again

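This caused all of the data to be synced, and everything is back to normal. I confirmed this by updating data and seeing the updates appear on the replica server.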
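
The four steps above map roughly onto the SQL statements below, generated here by a small Python helper. This is only a sketch under assumptions: the subscription name `replica_subscription` and the table names come from the logs, but the publication name `my_publication` and the connection string are hypothetical placeholders, and the schema-only reload in step 3 happens outside these statements.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value):
    """Single-quote an SQL string literal (assumes standard_conforming_strings = on)."""
    return "'" + value.replace("'", "''") + "'"

def resync_statements(subscription, publication, conninfo, tables):
    """Build the statements to run on the replica, in order."""
    table_list = ", ".join(quote_ident(t) for t in tables)
    return [
        # Step 1: drop the subscription.
        f"DROP SUBSCRIPTION {quote_ident(subscription)};",
        # Step 2: drop all replicated tables in one statement.
        f"DROP TABLE {table_list};",
        # Step 3 is done outside SQL: reload the schema only, with no data.
        "-- reload schema only (no data) here",
        # Step 4: recreate the subscription; by default it copies existing data.
        f"CREATE SUBSCRIPTION {quote_ident(subscription)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident(publication)};",
    ]

statements = resync_statements(
    subscription="replica_subscription",
    publication="my_publication",  # hypothetical name
    conninfo="host=primary.example dbname=server_prod user=replica_user",  # hypothetical
    tables=["account_locations", "messages", "user_photos", "class_registrations"],
)
for statement in statements:
    print(statement)
```

Starting from empty tables matters here because the initial table synchronization copies every row with COPY, so any rows already present on the replica can collide on the primary key.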

The CPU load and bandwidth were apparently high because of the replication errors: Postgres kept retrying as fast as it could.
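
A retry loop like this might be spotted earlier by checking the per-table sync state on the subscriber. The sketch below assumes the `pg_subscription_rel` catalog, whose `srsubstate` column uses `i` (initialize), `d` (data is being copied), `s` (synchronized) and `r` (ready); the query constant and the helper `tables_not_ready` are illustrative names, and the rows are expected to come from any DB-API cursor.

```python
# Query to run on the subscriber; returns one row per subscribed table.
SUBSCRIPTION_STATE_QUERY = """
SELECT s.subname, c.relname, r.srsubstate
FROM pg_subscription_rel AS r
JOIN pg_subscription AS s ON s.oid = r.srsubid
JOIN pg_class AS c ON c.oid = r.srrelid
ORDER BY s.subname, c.relname
"""

# State codes documented for pg_subscription_rel.srsubstate in PostgreSQL 11.
STATE_NAMES = {
    "i": "initialize",
    "d": "data is being copied",
    "s": "synchronized",
    "r": "ready",
}

def tables_not_ready(rows):
    """Return (subscription, table, state name) for every table not yet in 'r'."""
    return [
        (subscription, table, STATE_NAMES.get(state, "unknown"))
        for subscription, table, state in rows
        if state != "r"
    ]

# Example rows shaped like the query result (values are made up).
example_rows = [
    ("replica_subscription", "account_locations", "d"),
    ("replica_subscription", "messages", "d"),
    ("replica_subscription", "user_photos", "r"),
]
for row in tables_not_ready(example_rows):
    print(row)
```

With a driver such as psycopg2, the rows would come from `cursor.execute(SUBSCRIPTION_STATE_QUERY)` followed by `cursor.fetchall()`. A table that stays out of the `r` state long after the subscription was created is a hint that its initial copy keeps failing.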
