How to remove old WAL files in PostgreSQL?

I am using a PostgreSQL database cluster. I have an issue with low disk space. After investigation, I found that it is caused by WAL files.

Because of the WAL files, my free disk space has dropped dramatically. Now I need to free up some space without losing any data or corrupting PostgreSQL. To free up space, I need to remove WAL files.

My cluster has 2 standby nodes and one primary node, so I need to free up space without any interruption.

What are the recommended steps to remove WAL files without any interruption in my PostgreSQL cluster?

Don't remove WAL segments manually. Instead, find out what keeps PostgreSQL from removing them and fix that condition.

There are several possibilities:

  1. a stale replication slot (most likely)

    Find out with this query on the primary:

     SELECT slot_name, active, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS age FROM pg_replication_slots;

    The age is the number of bytes of WAL the slot is holding back. If there is a slot with a high age, that is your problem.

    Examine the standby whose slot is behind and look into its log to find out why it is not replicating. Either fix that problem so that the standby can catch up, or abandon that standby by dropping the replication slot:

     SELECT pg_drop_replication_slot('bad_slot');

  2. the archiver got stuck

    Examine the contents of pg_stat_archiver on the primary. If that tells you that the archiver has problems, look at the log file to see detailed error messages. Fix the problem so that archiving can resume.

    If you want to stop archiving (which will break your backup), you can set archive_command to something like /bin/true and reload.

  3. a much too high wal_keep_size / wal_keep_segments

    If that parameter on the primary is your problem, simply reduce the value and reload.
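
The three checks above can be run from one psql session on the primary. This is only a sketch: it assumes PostgreSQL 13 or later (older releases use wal_keep_segments instead of wal_keep_size), and the value '1GB' is an arbitrary placeholder, not a recommendation. Run the ALTER SYSTEM step only if wal_keep_size really turns out to be the cause.

```sql
-- 1. Replication slots: how much WAL each slot is holding back.
--    pg_wal_lsn_diff() returns bytes; pg_size_pretty() makes that readable.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;

-- 2. Archiver: a growing failed_count or a recent last_failed_time
--    suggests that archive_command keeps failing.
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count,
       last_failed_wal,
       last_failed_time
FROM pg_stat_archiver;

-- 3. WAL retention setting: inspect it, then lower it and reload.
SHOW wal_keep_size;
ALTER SYSTEM SET wal_keep_size = '1GB';  -- placeholder value, pick your own
SELECT pg_reload_conf();                 -- reload the configuration, no restart
```

ALTER SYSTEM writes the setting to postgresql.auto.conf, which takes precedence over postgresql.conf; editing postgresql.conf and reloading works just as well if you prefer to keep the configuration in one file.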

Once you have fixed the problem, WAL will get removed. That can take a while, since WAL is removed during checkpoints. You can force a checkpoint with the CHECKPOINT SQL statement.
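
To confirm that space is actually being released, one option is to measure the WAL directory before and after a forced checkpoint. A minimal sketch, assuming PostgreSQL 10 or later and a role that is allowed to call pg_ls_waldir() and to run CHECKPOINT (a superuser, for example):

```sql
-- Number and total size of WAL segments before the checkpoint.
SELECT count(*) AS segments,
       pg_size_pretty(sum(size)) AS total_size
FROM pg_ls_waldir();

-- Force a checkpoint so that WAL that is no longer needed can be removed.
CHECKPOINT;

-- Same measurement again, for comparison.
SELECT count(*) AS segments,
       pg_size_pretty(sum(size)) AS total_size
FROM pg_ls_waldir();
```

The directory may not shrink all at once: PostgreSQL recycles some old segments for future use instead of deleting them, so the size tends to settle somewhere between min_wal_size and max_wal_size.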
