
docker-compose healthcheck retry frequency != interval

I recently set up healthchecks in my docker-compose config.

They work great and I like them. Here's a typical example:

services:
  app:
    healthcheck:
      test: curl -sS http://127.0.0.1:4000 || exit 1
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 30s

My container is quite slow to boot, hence I set up a 30-second start_period.

But it doesn't really fit my needs: I don't need a check every 5 seconds, but for my orchestration I need to know as soon as possible when the container is ready for the first time. Since my start_period is only approximate, if the container is not ready at the first check, I have to wait a full interval before the next attempt.

What I'd like to have is:

  • While the container is not healthy, retry every 5 seconds
  • Once it is healthy, check every 1 minute

Is there no way to achieve this out of the box with docker-compose?

I could write a custom script to achieve this, but I'd rather have a native solution if possible.

Unfortunately, this is not possible out of the box.
All the durations you set are fixed; they can't change depending on the container state.

However, according to the documentation, the probe does not seem to wait for the start_period to finish before running your test. The only effect is that any failure happening during start_period is not counted as an error.

Below is the passage that makes me think so:

start_period provides initialization time for containers that need time to bootstrap. Probe failure during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.

I encourage you to test whether this is really the case, as I've never paid attention to whether the healthcheck actually runs during the start period.
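One way to run that test is to watch the health status reported by docker inspect while the container boots. The helper below is only a sketch: the function name watch_health and its arguments are made up for illustration, and it assumes the container defines a healthcheck. If the status flips from starting to healthy before start_period has elapsed, the probe is clearly running during the start period.

```shell
#!/bin/sh
# watch_health CONTAINER [SECONDS]  (hypothetical helper, not part of Docker)
# Prints the container's health status each time it changes, prefixed with
# the number of seconds elapsed since the watch began.
watch_health() {
  container=$1
  duration=${2:-60}          # how long to keep watching, in seconds
  start=$(date +%s)
  last=""
  while :; do
    elapsed=$(( $(date +%s) - start ))
    [ "$elapsed" -ge "$duration" ] && break
    # .State.Health.Status is "starting", "healthy" or "unhealthy"
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container")
    if [ "$status" != "$last" ]; then
      echo "${elapsed}s: $status"
      last=$status
    fi
    sleep 1
  done
}
```

Start it right after docker-compose up -d, for example watch_health my_app_container 120, where my_app_container stands for whatever name your container has.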
If it does, and you're unsure how long startup takes, you can probably increase your start_period and also increase the interval to find a good compromise.
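For readers on newer versions: as far as I know, Docker Engine 25.0 (with a matching recent Compose release) added a start_interval option, which sets the time between checks while the container is still in its start period. I have not verified the exact version requirements, so treat the following as a sketch to try rather than a guaranteed fix. The 2m start_period is an assumed generous upper bound, not a value from the question.

```yaml
services:
  app:
    healthcheck:
      test: curl -sS http://127.0.0.1:4000 || exit 1
      interval: 1m          # steady-state check frequency once healthy
      timeout: 3s
      retries: 3
      start_period: 2m      # assumed upper bound on boot time
      start_interval: 5s    # check frequency during the start period
```

On older versions the start_interval key is not available, and the options above remain fixed for the lifetime of the container.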

I wrote a script that does this, though I'd still rather find a native solution:

#!/bin/sh
# Run the given health check command, but skip it (and report healthy)
# if it already succeeded less than five minutes ago.

HEALTHCHECK_FILE="/root/.healthchecked"

if [ "$#" -eq 0 ]; then
  echo "Usage: healthcheck_retry <COMMAND>" >&2
  exit 1
fi

if [ -r "$HEALTHCHECK_FILE" ]; then
  # The file's modification time is the time of the last successful check.
  LAST_HEALTHCHECK=$(date -r "$HEALTHCHECK_FILE" +%s)
  FIVE_MINUTES_AGO=$(( $(date +%s) - 5 * 60 ))
  echo "Healthcheck file present"
  if [ "$LAST_HEALTHCHECK" -gt "$FIVE_MINUTES_AGO" ]; then
    echo "Healthcheck too recent"
    exit 0
  fi
fi

# Run the command exactly as passed, preserving argument boundaries.
if "$@"; then
  echo "\"$*\" succeeded: updating file"
  touch "$HEALTHCHECK_FILE"
  exit 0
else
  echo "\"$*\" failed: exiting"
  exit 1
fi

I use it like this: test: /healthcheck_retry.sh curl -fsS localhost:4000/healthcheck

The pain is that I need to make sure the script is available in every container, so I have to mount an extra volume for it:

    image: postgres:11.6-alpine
    volumes:
      - ./scripts/utils/healthcheck_retry.sh:/healthcheck_retry.sh
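
Putting the pieces together, a full service definition might look like the sketch below. The service name db, the pg_isready probe and the placeholder password are my assumptions for illustration; only the image and the volume come from the fragment above. The script has to be executable on the host, and the :ro suffix just mounts it read-only.

```yaml
services:
  db:                                   # hypothetical service name
    image: postgres:11.6-alpine
    environment:
      POSTGRES_PASSWORD: example        # placeholder, required by the image
    volumes:
      - ./scripts/utils/healthcheck_retry.sh:/healthcheck_retry.sh:ro
    healthcheck:
      # Docker invokes the wrapper every 5s; the wrapper only runs
      # pg_isready when the last success is more than 5 minutes old.
      test: ["CMD", "/healthcheck_retry.sh", "pg_isready", "-U", "postgres"]
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 30s
```

With this setup the short interval gives fast detection of the first healthy state, while the wrapper keeps the real probe from running more than once every five minutes afterwards.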
