Julia cluster using Docker

Question

I am trying to connect to Docker containers using the default SSHManager. These containers only have a running sshd with public key authentication, and Julia installed.

Here is my Dockerfile:

FROM rastasheep/ubuntu-sshd
RUN apt-get update && apt-get install -y julia
RUN mkdir -p /root/.ssh
ADD id_rsa.pub /root/.ssh/authorized_keys

I am running the container using:

sudo docker run -d -p 3333:22 -it --name julia-sshd julia-sshd

Then, on the host machine, using the Julia REPL, I get the following error:

julia> import Base:SSHManager
julia> addprocs(["root@localhost:3333"])
stdin: is not a tty
Worker 2 terminated.
ERROR (unhandled task failure): EOFError: read end of file
Master process (id 1) could not connect within 60.0 seconds.
exiting.

I have tested that I can connect to the container via ssh without a password.

I have also tested that, in the Julia REPL, I can add a regular machine with Julia installed to the cluster, and it works fine.

But I cannot get these two things working together. Any help or suggestions will be appreciated.

Answer 1

I recommend that you also deploy the Master in a Docker container. It makes your environment easily and fully reproducible.

I'm working on a way of deploying Workers in Docker containers on demand, i.e., the Master deployed in a container can deploy further DockerizedJuliaWorkers. It is similar to https://github.com/gsd-ufal/Infra.jl but assumes that Master and Workers run on the same host, which makes things simpler.

It is ongoing work and I plan to finish it in the coming weeks. In a nutshell:

1) You'll need a simple DockerBackend and a wrapper to transparently run containers, set up SSH, and call addprocs with all the low-level parameters (i.e., the DockerizedJuliaWorker.jl file):

https://github.com/NaelsonDouglas/DistributedMachineLearningThesis/tree/master/src/docker

2) Read here how to build the Docker image (a Dockerfile is included):

https://github.com/NaelsonDouglas/DistributedMachineLearningThesis
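As a rough illustration of what the wrapper in step 1 has to do (this is not the code from the repository above), here is a minimal shell sketch. It assumes the image is tagged julia-sshd as in the question and already contains the Master's public key in authorized_keys. The function names, the julia-worker-N container names, and the port range are made up for the example. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of a worker launcher. Not the code from the linked repository.
# Assumptions: the image is tagged julia-sshd (as in the question), its
# authorized_keys already holds the master's public key, and the worker
# names and port range below are invented for illustration.
IMAGE=${IMAGE:-julia-sshd}
BASE_PORT=${BASE_PORT:-3333}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print commands; set to 0 to really run them

# run_cmd: run the given command, or just print it in dry-run mode.
run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_worker N: start the N-th container with its sshd published on
# BASE_PORT+N, then print the machine spec that addprocs expects.
start_worker() {
    port=$((BASE_PORT + $1))
    run_cmd docker run -d -p "${port}:22" --name "julia-worker-$1" "$IMAGE" >&2
    echo "root@localhost:${port}"
}

# launch_cluster COUNT: start COUNT containers and hand their specs to addprocs.
launch_cluster() {
    specs=""
    i=0
    while [ "$i" -lt "$1" ]; do
        specs="${specs}\"$(start_worker "$i")\","
        i=$((i + 1))
    done
    # "using Distributed" is needed on Julia 0.7 and later; drop it on older versions.
    run_cmd julia -e "using Distributed; addprocs([${specs%,}]); println(workers())"
}
```

Calling launch_cluster 2 with the default DRY_RUN=1 only prints the two docker commands and the julia command; with DRY_RUN=0 it executes them. In a real run the containers may need a moment before sshd accepts connections, so a short wait or retry before addprocs is probably needed.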

Please tell me if you have any suggestions on how to improve it.

Best,

André Lage.

Answer 1
2 2018-09-08 04:12:11