PostgreSQL 9.6 understanding wal files

I am trying to understand the behaviour of WAL files. The WAL-related settings of the database are as follows:

"min_wal_size"  "2GB"   
"max_wal_size"  "20GB"
"wal_segment_size"  "16MB"
"wal_keep_segments" "0"
"checkpoint_completion_target"  "0.8"
"checkpoint_timeout"    "15min"

The number of wal files is always 1281 or higher:

SELECT COUNT(*) FROM pg_ls_dir('pg_xlog') WHERE pg_ls_dir ~ '^[0-9A-F]{24}';
-- count 1281

As I understand it, this means the WAL files currently never fall below max_wal_size (1281 * 16 MB = 20496 MB, just over max_wal_size = 20 GB = 20480 MB, which is 1280 segments)?

I would expect the number of wal files to decrease below maximum right after a checkpoint is reached and data is synced to disk. But this is clearly not the case. What am I missing?

As per the documentation (emphasis added):

The number of WAL segment files in pg_xlog directory depends on min_wal_size, max_wal_size and the amount of WAL generated in previous checkpoint cycles. When old log segment files are no longer needed, they are removed or recycled (that is, renamed to become future segments in the numbered sequence). If, due to a short-term peak of log output rate, max_wal_size is exceeded, the unneeded segment files will be removed until the system gets back under this limit. Below that limit, the system recycles enough WAL files to cover the estimated need until the next checkpoint, and removes the rest.

So, as per your observation, you are probably observing the "recycle" effect -- the old WAL files are getting renamed instead of getting removed. This saves the disk some I/O, especially on busy systems.
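One way to see the recycle effect is to count how many segment files in pg_xlog have names that sort after the segment currently being written; those are the renamed (or preallocated) future segments. The query below is only a sketch. It assumes PostgreSQL 9.6 function names (pg_current_xlog_location and pg_xlogfile_name, both renamed in version 10), a primary server that is not in recovery, a role allowed to call pg_ls_dir, and a single timeline so that file names compare in write order. The column aliases are made up for illustration.

```sql
-- Sketch: split the segment files in pg_xlog into "ahead of" and
-- "at or behind" the segment that is currently being written.
-- Assumes a single timeline; COLLATE "C" forces plain byte-order comparison.
WITH cur AS (
    SELECT pg_xlogfile_name(pg_current_xlog_location()) AS current_segment
)
SELECT
    cur.current_segment,
    -- files named for future segments: recycled or preallocated, not yet used
    count(*) FILTER (WHERE f.name >  cur.current_segment COLLATE "C") AS recycled_ahead,
    -- the segment in use plus older ones still kept on disk
    count(*) FILTER (WHERE f.name <= cur.current_segment COLLATE "C") AS current_or_older
FROM cur, pg_ls_dir('pg_xlog') AS f(name)
WHERE f.name ~ '^[0-9A-F]{24}$'   -- segment files only, no .history/.backup files
GROUP BY cur.current_segment;
```

If recycled_ahead is large and shrinks only slowly between runs, most of the files are recycled segments waiting to be used.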

Bear in mind that once a particular file has been recycled, it will not be reconsidered for removal/recycle again until it has been used (i.e., the relevant LSN is reached and checkpointed). That may take a long time if your system suddenly becomes less active.

If your server is very busy and then abruptly becomes mostly idle, you can get into a situation where the log files remain at max_wal_size for a very long time. At the time it was deciding whether to remove or recycle the files, it was using them up quickly and so decided to recycle up to max_wal_size for predicted future use, rather than remove them. Once recycled, they will never get removed until they have been used (you could argue that that is a bug), and if the server is now mostly idle it will take a very long time for them to be used and thus removed.
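To check whether a server is in this state, the number of segments on disk can be compared with the number that max_wal_size allows. A sketch, assuming PostgreSQL 9.6 (where pg_size_bytes is available), that current_setting reports both settings with a unit as in the question ('20GB', '16MB'), and a role allowed to call pg_ls_dir; the column aliases are illustrative:

```sql
-- Sketch: compare the max_wal_size limit, expressed in segments,
-- with the number of segment files actually present in pg_xlog.
SELECT
    pg_size_bytes(current_setting('max_wal_size'))
      / pg_size_bytes(current_setting('wal_segment_size')) AS segments_at_max_wal_size,
    (SELECT count(*)
       FROM pg_ls_dir('pg_xlog') AS f(name)
      WHERE f.name ~ '^[0-9A-F]{24}$') AS segments_on_disk;
```

With the settings from the question, 20 GB / 16 MB is 1280 segments, so a count of 1281 files means the directory is sitting right at the limit.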
