archive_cleanup_command does not clear the archived WAL files

Question

Main question:
archive_cleanup_command in the postgresql.conf file does not clear the archived WAL files. How can I get it to clear them?

Relevant information:

My OS is Linux, Ubuntu v18.04 LTS.
Database is PostgreSQL version 13.

My current settings:
/etc/postgresql/13/main/postgresql.conf file:

wal_level = replica
wal_compression = on

wal_recycle = on
checkpoint_timeout = 5min

max_wal_size = 1GB
min_wal_size = 80MB

archive_mode = on
archive_command = 'pxz --compress --keep --force -6 --to-stdout --quiet %p > /datadrive/postgresql/13/wal_aerchives/%f.xz'

archive_timeout = 10min

restore_command = 'pxz --decompress --keep --force -6 --to-std-out --quiet /datadrive/postgresql/13/wal_archives/%f.xz > %p'

archive_cleanup_command = 'pg_archivecleanup -d -x .xz /datadrive/postgresql/13/wal_archives %r >> /datadrive/postgresql/13/wal_archives/archive_cleanup_command.log 2>&1'

archive_cleanup_command.log has 777 permissions.

I have a master database doing logical replication with a publication, and a slave database subscribing to that publication. It is on the slave that I intend to do the archiving and restore points.

What am I expecting to happen?
The checkpoint_timeout setting in the postgresql.conf file means that a restart point is created at least every 5 minutes. The archive_timeout setting of 10 minutes means that PostgreSQL forces a log file segment switch every 10 minutes. Therefore, at least every 10 minutes, a restart point is created. Whenever a restart point is created, the archive cleanup command is run. When this command runs, it should clear all the .xz files older than that restart point. Therefore the wal_archives directory should not contain .xz files older than 20 minutes, or at most 2 hours.

What is actually happening?

The /datadrive/postgresql/13/wal_archives directory piles up with lots of .xz files that never get cleared.
cat archive_cleanup_command.log shows an empty file. Nothing ever writes to it.

When I run the pg_archivecleanup command manually via bash, it works (i.e. it clears all the archive files before the one specified, and cat archive_cleanup_command.log shows the files that were cleared).
Example:

 pg_archivecleanup -d -x .xz /datadrive/postgresql/13/wal_archives 000000010000045E000000E5 >> /datadrive/postgresql/13/wal_archives/archive_cleanup_command.log 2>&1

Then running cat archive_cleanup_command.log gives this:

 pg_archivecleanup: keeping WAL file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E5" and later
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000DE.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000DF.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E0.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E1.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E2.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E3.xz"
 pg_archivecleanup: removing file "/datadrive/postgresql/13/wal_archives/000000010000045E000000E4.xz"

What have I tried?

I have tried various permission settings (examples: chmod 777 on the wal_archives directory, adding other users to the postgres group, etc.).
Read the PostgreSQL documentation thoroughly and looked at no fewer than 20 related Stack Overflow posts.
Initially tried the 7zip command-line tool for the compression instead of pxz.
Successfully restarted the database multiple times using the following commands:
```
sudo systemctl stop postgresql@13-main
sudo systemctl start postgresql@13-main
```
Dropped the logical replication and re-created the publication on the master and the subscription on the slave.
Enabled checkpoints on the master itself.
Looked at /var/log/postgresql/postgresql-13-main.log. Unfortunately, no relevant errors show up in this log.

Answer 1

Restartpoints, restore_command and archive_cleanup_command only apply to streaming ("physical") replication, or to recovery in general, not to logical replication.

A logical replication standby is not in recovery; it is open for reading and writing. In that state, recovery settings like archive_cleanup_command are ignored.
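One way to confirm this on the subscriber is to ask the server directly: pg_is_in_recovery() returns f on a server that is open for reading and writing. The helper below is only a sketch. The function name is_in_recovery is made up for illustration, and it assumes psql can connect with whatever connection options are passed through to it.

```shell
# Sketch: succeed (exit status 0) only when the server reports it is in recovery.
# is_in_recovery is a hypothetical helper name; extra arguments are handed to
# psql unchanged (for example -h, -p, -U, -d).
is_in_recovery() {
    # -X: skip ~/.psqlrc, -A: unaligned output, -t: tuples only
    [ "$(psql -X -At -c 'SELECT pg_is_in_recovery();' "$@")" = "t" ]
}
```

Run as the postgres user, `is_in_recovery && echo "in recovery" || echo "not in recovery"` would be expected to print "not in recovery" on a logical replication subscriber, which is why archive_cleanup_command never fires there.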

You will have to find another mechanism to delete old WAL archives, ideally in combination with your backup solution.
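As a rough sketch of such a mechanism, a script run from cron could read the latest checkpoint's REDO WAL file from pg_controldata and hand it to pg_archivecleanup, which approximates what %r would supply on a physical standby. The function names below are hypothetical. The sketch also assumes that pg_controldata and pg_archivecleanup are on the PATH (on Ubuntu they usually live under /usr/lib/postgresql/13/bin) and, importantly, that no base backup still needs the older segments. If point-in-time recovery from an older base backup matters, the cutoff should come from the backup tool instead.

```shell
# Sketch: prune archived WAL on a server that is NOT in recovery.
# Assumption: every segment older than the latest checkpoint's REDO segment
# is no longer needed by any base backup.

# Read pg_controldata output on stdin and print the REDO WAL file name.
redo_wal_file() {
    sed -n "s/^Latest checkpoint's REDO WAL file:[[:space:]]*//p"
}

# prune_archive DATA_DIR ARCHIVE_DIR (hypothetical wrapper)
prune_archive() {
    data_dir=$1
    archive_dir=$2
    keep=$(pg_controldata "$data_dir" | redo_wal_file)
    if [ -z "$keep" ]; then
        echo "could not determine the REDO WAL file" >&2
        return 1
    fi
    # -d: print debug output, -x .xz: ignore the .xz suffix when comparing names
    pg_archivecleanup -d -x .xz "$archive_dir" "$keep"
}
```

A crontab entry for the postgres user could then call `prune_archive <data directory> /datadrive/postgresql/13/wal_archives` at whatever interval suits the retention policy. Running pg_archivecleanup with -n first lists the files that would be removed without deleting anything.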

Answer 1
1 accepted 2020-11-17 06:42:14