Docker containers refuse to communicate when running docker-compose in dind - Gitlab CI/CD

I am trying to set up some integration tests in Gitlab CI/CD. To run these tests, I want to reconstruct my system (several linked containers) using the Gitlab runner and docker-compose up. My system is composed of several containers that communicate with each other through MQTT, and an InfluxDB container which is queried by the other containers.

I've managed to get to a point where the runner actually executes docker-compose up and creates all the relevant containers. This is my .gitlab-ci.yml file:

    image: docker:19.03

    variables:
      DOCKER_DRIVER: overlay2
      DOCKER_TLS_CERTDIR: "/certs"

    services:
      - name: docker:19.03-dind
        alias: localhost

    before_script:
      - docker info

    integration-tests:
      stage: test
      script:
        - apk add --no-cache docker-compose
        - docker-compose -f "docker-compose.replay.yml" up -d --build
        - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

As you can see, I am installing docker-compose, running compose up on my config yml file, and then executing my integration tests from within one of the containers. When I run that final line on my local system, the integration tests run as expected; in the CI/CD environment, however, all the tests throw some variation of ConnectionRefusedError: [Errno 111] Connection refused. Running docker-compose ps seems to show all the relevant containers as Up and healthy.

I have found that the errors occur whenever one container tries to communicate with another, through lines like self.localClient = InfluxDBClient("influxdb", 8086, database = "replay") or client.connect("mosquitto", 1883, 60). This works fine in my local Docker environment, as the address names resolve to the other running containers, but it seems to be creating problems in this Docker-in-Docker setup. Does anyone have any suggestions? Do containers in this dind environment have different names?
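One way to narrow this down is to separate name-resolution failures from refused connections. The sketch below is a hypothetical helper (the name probe and its return labels are my own, not part of any library) that uses only Python's standard socket module. It assumes it is run inside one of the compose containers, for example datareplay, where Python is already available.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify the outcome of a TCP connection attempt to host:port.

    Returns one of: "dns_failure", "open", "refused", "unreachable".
    """
    try:
        # Step 1: resolve the name. This fails if the service name is not
        # known on any network this container is attached to.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    family, socktype, proto, _, sockaddr = infos[0]
    try:
        # Step 2: try to open a TCP connection to the first resolved address.
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return "open"
    except ConnectionRefusedError:
        # The name resolved, but nothing is listening on that port.
        return "refused"
    except OSError:
        # Timeouts, missing routes and similar network errors.
        return "unreachable"
```

Calling probe("influxdb", 8086) and probe("mosquitto", 1883) from inside the datareplay container should show which case applies. A dns_failure result would point at the containers not sharing a network, while refused means the name resolved but nothing accepted the connection on that port, which is what Errno 111 normally indicates.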

It is also worth mentioning that this could be a problem with my docker-compose.yml file not being configured correctly to start healthy containers. docker-compose ps suggests they are up, but is there a better way to check whether they are running correctly? Here's an excerpt of my docker-compose file:

services:
    datareplay:
      networks:
        - web
        - influxnet
        - brokernet
      image: data-replay
      build:
        context: data-replay
      volumes:
        - ./data-replay:/data-replay

    mosquitto:
      image: eclipse-mosquitto:latest
      hostname: mosquitto
      networks:
        - web
        - brokernet

networks:
  web:
  influxnet:
    internal: true
  brokernet:
    driver: bridge
    internal: true
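On the question of checking whether the containers are really ready: docker-compose ps reports that a container's process is up, which is not the same as the service inside it accepting connections. A minimal sketch of a readiness check follows, assuming it runs at the start of integration_tests.py inside the datareplay container. The function name wait_for_port and its default timings are hypothetical choices, not part of any library.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Retry a TCP connection until it succeeds or `timeout` seconds pass.

    Returns True as soon as a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection resolves the name and connects in one call.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers refused connections, DNS failures and timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, the tests could call wait_for_port("influxdb", 8086) and wait_for_port("mosquitto", 1883) before creating any clients, and abort with a clear message if either returns False. This only shows that the port accepts TCP connections; it does not prove that InfluxDB or the broker has finished initialising.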

There are a few possible reasons why this error is occurring:

  1. Docker 19.03-dind is known to be problematic and can fail to create networks when used as a service without a proper TLS setup. Have you correctly set up your Gitlab Runner with TLS certificates? I've noticed you are using "/certs" in your gitlab-ci.yml; did you configure your runner to share the volume where the certificates are stored?

  2. If your Gitlab Runner is not running with privileged permissions, or is not correctly configured to use the remote machine's network socket, you won't be able to create networks. A simple way to unify your networks for a CI/CD environment is to configure your machine using this docker-compose followed by this script (Source). It sets up a local network in which containers can reach each other by hostname, using the bridge network driver.

  3. There's an issue with gitlab-ci.yml as well, in this part of the script:

     services:
       - name: docker:19.03-dind
         alias: localhost

     integration-tests:
       stage: test
       script:
         - apk add --no-cache docker-compose
         - docker-compose -f "docker-compose.replay.yml" up -d --build
         - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

You're renaming your docker hostname to localhost, but you never use it. Instead, you call docker and docker-compose directly from your image, binding them to a different set of networks than the ones created automatically by Gitlab.
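To see which daemon the docker and docker-compose clients are actually talking to, it can help to inspect the DOCKER_HOST environment variable in the job; when it is unset, the Docker CLI falls back to the local unix socket at /var/run/docker.sock. The sketch below is a hypothetical helper (describe_docker_host is my own name) that parses that variable, and it assumes Python is available wherever you run it, which is not the case in the stock docker image.

```python
import os
from urllib.parse import urlsplit

# Default endpoint used by the Docker CLI when DOCKER_HOST is not set.
DEFAULT_DOCKER_HOST = "unix:///var/run/docker.sock"


def describe_docker_host(env=None) -> dict:
    """Report the transport and target that DOCKER_HOST points at."""
    env = os.environ if env is None else env
    value = env.get("DOCKER_HOST") or DEFAULT_DOCKER_HOST
    parts = urlsplit(value)
    if parts.scheme == "unix":
        # A unix socket has no host or port, only a filesystem path.
        return {"transport": "unix", "target": parts.path}
    # For tcp:// endpoints, report host:port of the remote daemon.
    return {"transport": parts.scheme, "target": f"{parts.hostname}:{parts.port}"}
```

With a value such as tcp://docker:2376 the helper reports the dind service as the target, whereas a unix result means the client is using a socket inside the job container itself. A plain echo $DOCKER_HOST in the job script gives the same information without Python.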

Let's try this solution (although I couldn't test it right now, so I apologize if it doesn't work right away):

gitlab-ci.yml

image: docker/compose:debian-1.28.5 # You should be running as a privileged Gitlab Runner
services:
  - docker:dind
integration-tests:
  stage: test
  script:
    #- apk add --no-cache docker-compose
    - docker-compose -f "docker-compose.replay.yml" up -d --build
    - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

docker-compose.yml

services:
  datareplay:
    networks:
      - web
      - influxnet
      - brokernet
    image: data-replay
    build:
      context: data-replay
    # volumes: You're mounting your volume to an ephemeral folder, which is in the CI pipeline and will be wiped afterwards (if you're using Docker-DIND)
    #   - ./data-replay:/data-replay
  mosquitto:
    image: eclipse-mosquitto:latest
    hostname: mosquitto
    networks:
      - web
      - brokernet

networks:
  web: # hostnames are created automatically, you don't need to specify a local setup through localhost
  influxnet:
  brokernet:
    driver: bridge #If you're using a bridge driver, an overlay2 doesn't make sense
  

The two commands below run and register a Gitlab Runner as a Docker container, without the hassle of configuring it manually to allow socket binding on your project.

(1):

docker run --detach --name gitlab-runner --restart always -v /srv/gitlab-runner/config:/etc/gitlab-runner -v /var/run/docker.sock:/var/run/docker.sock gitlab/gitlab-runner:latest

And then (2):

docker run --rm -v /srv/gitlab-runner/config:/etc/gitlab-runner gitlab/gitlab-runner register --non-interactive --description "monitoring cluster instance" --url "https://gitlab.com" --registration-token "replacethis" --executor "docker"  --docker-image "docker:latest" --locked=true  --docker-privileged=true --docker-volumes /var/run/docker.sock:/var/run/docker.sock 

Remember to replace the registration token in command (2).
