
Docker containers refuse to communicate when running docker-compose in dind - Gitlab CI/CD

I am trying to set up some integration tests in Gitlab CI/CD - in order to run these tests, I want to reconstruct my system (several linked containers) using the Gitlab runner and docker-compose up. My system is composed of several containers that communicate with each other through mqtt, and an InfluxDB container which is queried by other containers.

I've managed to get to a point where the runner actually executes docker-compose up and creates all the relevant containers. This is my .gitlab-ci.yml file:

    image: docker:19.03

    variables:
      DOCKER_DRIVER: overlay2
      DOCKER_TLS_CERTDIR: "/certs"

    services:
      - name: docker:19.03-dind
        alias: localhost

    before_script:
      - docker info

    integration-tests:
      stage: test
      script:
        - apk add --no-cache docker-compose
        - docker-compose -f "docker-compose.replay.yml" up -d --build
        - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

As you can see, I am installing docker-compose, running compose up on my config yml file and then executing my integration tests from within one of the containers. When I run that final line on my local system, the integration tests run as expected; in the CI/CD environment, however, all the tests throw some variation of ConnectionRefusedError: [Errno 111] Connection refused errors. Running docker-compose ps seems to show all the relevant containers Up and healthy.

I have found that the errors occur whenever one container tries to communicate with another, through lines like self.localClient = InfluxDBClient("influxdb", 8086, database = "replay") or client.connect("mosquitto", 1883, 60). This works fine in my local Docker environment, as the names resolve to the other running containers, but it seems to cause problems in this Docker-in-Docker setup. Does anyone have any suggestions? Do containers in this dind environment have different names?

It is also worth mentioning that this could be a problem with my docker-compose.yml file not being configured correctly to start healthy containers. docker-compose ps suggests they are up, but is there a better way to check whether they are running correctly? Here's an excerpt of my docker-compose file:

services:
    datareplay:
      networks:
        - web
        - influxnet
        - brokernet
      image: data-replay
      build:
        context: data-replay
      volumes:
        - ./data-replay:/data-replay

    mosquitto:
      image: eclipse-mosquitto:latest
      hostname: mosquitto
      networks:
        - web
        - brokernet

networks:
  web:
  influxnet:
    internal: true
  brokernet:
    driver: bridge
    internal: true

There are a few possible reasons why this error is occurring:

  1. Docker 19.03-dind is known to have problems creating networks when it is used as a service without a proper TLS setup. Have you correctly set up your Gitlab Runner with TLS certificates? I noticed you are using "/certs" in your gitlab-ci.yml; did you configure your runner to share the volume where the certificates are stored?

  2. If your Gitlab Runner is not running with privileged permissions, or is not correctly configured to use the remote machine's network socket, you won't be able to create networks. A simple way to unify your networks in a CI/CD environment is to configure your machine using this docker-compose followed by this script (Source). It will set up a local bridged network in which containers can reach each other by hostname.

  3. There's also an issue with your gitlab-ci.yml, in this part of the script:

         services:
           - name: docker:19.03-dind
             alias: localhost

         integration-tests:
           stage: test
           script:
             - apk add --no-cache docker-compose
             - docker-compose -f "docker-compose.replay.yml" up -d --build
             - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

You're aliasing the Docker service hostname to localhost, but you never use that alias. Instead, you call docker and docker-compose directly from your image, which binds them to a different set of networks than the ones Gitlab creates automatically.
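To narrow down which of these cases applies, it can help to separate "the name does not resolve" from "the name resolves but nothing is listening". The following is only a minimal sketch, not something from your project: the function name diagnose is made up for illustration, and it assumes plain TCP reachability is a good enough signal for both the broker and InfluxDB.

```python
import socket


def diagnose(host, port, timeout=3.0):
    """Classify a TCP connection attempt to host:port (hypothetical helper)."""
    try:
        # Resolve the name first so DNS failures are reported separately.
        addr = socket.gethostbyname(host)
    except socket.gaierror:
        return "name does not resolve"
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The name resolved, but no process accepts connections on that port.
        return "resolved to %s, connection refused" % addr
    except socket.timeout:
        return "resolved to %s, timed out" % addr
    except OSError as exc:
        return "resolved to %s, error: %s" % (addr, exc)
```

Run from inside the datareplay container, calls such as diagnose("mosquitto", 1883) and diagnose("influxdb", 8086) would show whether the service names from the compose file resolve at all in the CI environment, and which address they resolve to.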

Let's try this solution (I couldn't test it right now, so I apologize if it doesn't work right away):

gitlab-ci.yml

image: docker/compose:debian-1.28.5 # You should be running as a privileged Gitlab Runner
services:
  - docker:dind
integration-tests:
  stage: test
  script:
    #- apk add --no-cache docker-compose
    - docker-compose -f "docker-compose.replay.yml" up -d --build
    - docker exec moderator-monitor_datareplay_1 bash -c 'cd src ; python integration_tests.py'

docker-compose.yml

services:
  datareplay:
    networks:
      - web
      - influxnet
      - brokernet
    image: data-replay
    build:
      context: data-replay
    # volumes: you're mounting your volume to an ephemeral folder, which is in the CI pipeline and will be wiped afterwards (if you're using Docker-DIND)
    #   - ./data-replay:/data-replay
  mosquitto:
    image: eclipse-mosquitto:latest
    hostname: mosquitto
    networks:
      - web
      - brokernet

networks:
  web: # hostnames are created automatically, you don't need to specify a local setup through localhost
  influxnet:
  brokernet:
    driver: bridge # If you're using a bridge driver, overlay2 doesn't make sense
  

The two commands below run a Gitlab Runner as a Docker container and register it, without the hassle of configuring it manually to allow socket binding for your project.

(1) :

docker run --detach --name gitlab-runner --restart always \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gitlab/gitlab-runner:latest

And then (2) :

docker run --rm -v /srv/gitlab-runner/config:/etc/gitlab-runner gitlab/gitlab-runner register \
  --non-interactive \
  --description "monitoring cluster instance" \
  --url "https://gitlab.com" \
  --registration-token "replacethis" \
  --executor "docker" \
  --docker-image "docker:latest" \
  --locked=true \
  --docker-privileged=true \
  --docker-volumes /var/run/docker.sock:/var/run/docker.sock

Remember to replace the registration token in command (2) with your own.
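As for whether docker-compose ps is enough to tell that the containers are running correctly: "Up" only means the container process has started, not that the service inside is already accepting connections. A rough way to check this from the test side is to wait for the ports before the tests start. This is a sketch under assumptions: wait_for_port is a hypothetical helper, and the timeout and retry interval are arbitrary defaults, not values taken from your setup.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers refused connections, timeouts and name resolution errors.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

At the top of integration_tests.py, calling wait_for_port("mosquitto", 1883) and wait_for_port("influxdb", 8086) and failing early when either returns False would make it clearer whether the tests started too soon or the services are genuinely unreachable.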
