Docker process not running, and any interaction with docker fails

To preface this, I am running Docker in an Ubuntu 20.04 VM on VirtualBox.

I created a simple shell script to kill any process running on port 9042 and then start my docker-compose file. Here is the script in question:

#!/bin/bash

# Check for and kill any processes running on port 9042
sudo kill -9 $(sudo lsof -t -i:9042)

# start docker-compose
docker-compose -f ./docker/docker-compose.yml up

Since running that, however, my docker installation has been completely unresponsive to any sort of interaction. Any docker command hangs indefinitely until cancelled with Ctrl+C, and any other system command that uses docker (such as sudo service docker start) also hangs indefinitely.

If I try to run dockerd, it fails with the message: failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid. As my system reports that docker is not running, I go ahead and delete /var/run/docker.pid. If I then try to run dockerd again, I get a different error message: failed to start daemon: error while opening volume store metadata database: timeout
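
Before deleting the PID file, it can help to confirm that the PID it records really is gone. The following is a minimal sketch, not part of the original troubleshooting: the function name check_pidfile is hypothetical, and the default path is taken from the error message above. It prints "running", "stale", or "missing".

```shell
#!/bin/bash
# Sketch: report whether the PID recorded in a PID file belongs to a live process.
# check_pidfile is a hypothetical helper; the default path comes from the
# dockerd error message quoted in the question.

check_pidfile() {
    local pidfile="${1:-/var/run/docker.pid}"
    local pid

    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi

    # Strip whitespace and newlines from the recorded PID.
    pid=$(tr -d '[:space:]' < "$pidfile")

    # Anything that is not a plain number cannot be a live PID.
    case "$pid" in
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac

    # kill -0 sends no signal and only tests for existence. It fails with a
    # permission error for other users' processes, so /proc is a fallback
    # (Linux-specific).
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        echo "running"
    else
        echo "stale"
    fi
}
```

Calling check_pidfile with no argument inspects /var/run/docker.pid. A result of "running" would suggest that a dockerd process is still alive despite the service being reported as stopped, in which case deleting the file only hides the conflict.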

At this stage, some of the docker commands start working again. docker version and docker help both work, but the docker daemon is still reported as not running. Attempting to run docker-compose up on a docker-compose file produces this output:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/home/david/.local/lib/python3.8/site-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 400, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/home/david/.local/lib/python3.8/site-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/.local/lib/python3.8/site-packages/docker/api/client.py", line 205, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/home/david/.local/lib/python3.8/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/home/david/.local/lib/python3.8/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/home/david/.local/lib/python3.8/site-packages/docker/api/client.py", line 228, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/.local/bin/docker-compose", line 8, in <module>
    sys.exit(main())
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/main.py", line 67, in main
    command()
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/main.py", line 123, in perform_command
    project = project_from_options('.', options)
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/command.py", line 60, in project_from_options
    return get_project(
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/command.py", line 131, in get_project
    client = get_client(
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/docker_client.py", line 41, in get_client
    client = docker_client(
  File "/home/david/.local/lib/python3.8/site-packages/compose/cli/docker_client.py", line 170, in docker_client
    client = APIClient(**kwargs)
  File "/home/david/.local/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/home/david/.local/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

Other system commands such as sudo service docker start still hang indefinitely until killed.

I have tried every single solution in this thread (Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running?) and this one (Docker commands do not respond anymore), but none of them work.

Does anyone know what could be the issue here?

EDIT: A few more points -

  • The docker.pid file reappears when I restart my VM
  • Restarting my VM does not remedy the problem
  • Executing commands as the root user likewise makes no difference
  • Trying to reinstall docker using sudo apt-get install --reinstall docker-ce also hangs, at the stage Preparing to unpack .../docker-ce_5%3a20.10.0~1.1.beta1-0~ubuntu-focal_amd64.deb ...
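
For comparison, here is a more defensive sketch of the port-cleanup script. It rests on an assumption the thread does not confirm: that lsof -t -i:9042 can match Docker's own processes (for example docker-proxy, when a container publishes that port) as well as clients merely connected to the port. The function names and the list of protected process names are hypothetical.

```shell
#!/bin/bash
# Sketch of a more defensive port cleanup. The function names and the list of
# protected process names are assumptions made for illustration.

PORT=9042

# Succeed if the command name belongs to Docker's own machinery, which is
# better stopped through docker or the service manager than with kill.
is_protected() {
    case "$1" in
        dockerd|docker-proxy|containerd|containerd-shim*) return 0 ;;
        *) return 1 ;;
    esac
}

free_port() {
    local port="$1" pid name
    # -sTCP:LISTEN restricts the match to listeners, so clients that are only
    # connected to the port are left alone. With no match the loop is skipped.
    for pid in $(sudo lsof -t -iTCP:"$port" -sTCP:LISTEN); do
        name=$(ps -o comm= -p "$pid")
        if is_protected "$name"; then
            echo "skipping $name (pid $pid): stop the container instead" >&2
        else
            # Plain kill sends SIGTERM, giving the process a chance to clean up.
            echo "sending SIGTERM to $name (pid $pid)" >&2
            sudo kill "$pid"
        fi
    done
}

# Example call, left commented out so that sourcing this file has no side effects:
# free_port "$PORT" && docker-compose -f ./docker/docker-compose.yml up
```

Unlike kill -9 on an unfiltered PID list, this does nothing when the port is free and never signals the daemon or its helpers directly.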

I know this is super late, but I found an answer through another, similar question I asked.

On Linux, Docker stores its data, including containers, under /var/lib/docker/ by default. I was able to identify the container that was causing the issue and deleted the actual container files. I then used the CLI to remove all other traces of the container, and docker was able to start running as normal.

Obviously doing this is risky, so make sure you take adequate steps to back up your machine first.
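
The answer does not say how the problem container was identified. As one possible starting point, the read-only sketch below lists the per-container directories with their disk usage. It assumes the default layout, in which each container has a directory named after its full ID under /var/lib/docker/containers/; list_container_dirs is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list per-container directories under Docker's data root, read-only.
# list_container_dirs is a hypothetical helper; the containers/ subdirectory
# layout is an assumption about a default Linux install.

list_container_dirs() {
    local root="${1:-/var/lib/docker}"
    local dir

    for dir in "$root"/containers/*/; do
        # When nothing matches, the glob stays literal; skip it.
        [ -d "$dir" ] || continue
        # Print disk usage and the directory name (the full container ID).
        printf '%s\t%s\n' "$(du -sh "$dir" | cut -f1)" "$(basename "$dir")"
    done
}
```

Reading /var/lib/docker normally requires root, so the function would need to run in a root shell. It only lists directories and deletes nothing.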
