How to create a Dockerfile with R + Anaconda3 + non-root user

I need to create a Dockerfile that emulates a normal workspace. We have a virtual machine where we train models, and we use R and Python 3.

I want to automate some of the processes without changing the codebase. For example, ~ must point to /home/<some user>.
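
One way to confirm that requirement inside the image is a small check run as the target user. This is only a sketch: the function name `check_home` is hypothetical and not part of the original setup, and it relies on the fact that the shell expands ~ from the HOME variable.

```shell
# check_home is a hypothetical helper: it verifies that ~ (i.e. $HOME)
# is /home/<user> for the user name passed as the first argument.
check_home() {
  expected="/home/$1"
  # Tilde expansion reads $HOME, so comparing HOME is equivalent to comparing ~.
  if [ "$HOME" = "$expected" ]; then
    echo "ok: ~ is $HOME"
  else
    echo "mismatch: ~ is $HOME, expected $expected" >&2
    return 1
  fi
}

# Check the current user; do not abort the calling script on a mismatch.
check_home "$(id -un)" || true
```

Running it as root, for instance, reports a mismatch, because root's home is /root rather than /home/root.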

The biggest problem is Anaconda3 in Docker, because every RUN instruction starts its own shell: activating a conda environment in one RUN does not carry over to the next.
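
The effect can be reproduced without Docker. In this minimal sketch each `sh -c` call stands in for one RUN instruction, and the exported variable stands in for what ENV does in a Dockerfile. The variable names `DEMO_ENV` and `DEMO_PATH_HINT` are made up for the illustration.

```shell
# Each `sh -c` call is a separate process, like each RUN in a Dockerfile.

# "RUN" 1: set a variable (a stand-in for `conda activate`); it lives only in this shell.
sh -c 'export DEMO_ENV=base; echo "first shell: DEMO_ENV=$DEMO_ENV"'

# "RUN" 2: a new shell starts clean, so the variable is gone.
sh -c 'echo "second shell: DEMO_ENV=${DEMO_ENV:-<unset>}"'

# Values exported by the parent (comparable to ENV in a Dockerfile) reach every child shell.
export DEMO_PATH_HINT=/opt/anaconda3/bin
sh -c 'echo "third shell: DEMO_PATH_HINT=$DEMO_PATH_HINT"'
```

This is why the Dockerfile below puts the Anaconda bin directory on PATH with ENV instead of relying on `conda activate`.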

Basis for my answer: https://github.com/xychelsea/anaconda3-docker/blob/main/Dockerfile

I've created my own mini R package installer:

install_r_packages.sh

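#!/bin/bash
# Install the pinned R packages listed as "name=version" lines in r-requirements.txt.

input="r-requirements.txt"
Rscript -e "install.packages('remotes')"

# Split each line on '='; IFS is changed only for the read, not for the whole script.
# The `|| [[ -n $package ]]` part handles a last line without a trailing newline.
while IFS='=' read -r package version || [[ -n $package ]]; do
  # Trim leading and trailing spaces from both fields.
  package=$(echo "$package" | sed 's/^ *//; s/ *$//')
  version=$(echo "$version" | sed 's/^ *//; s/ *$//')
  # Skip comment lines and blank lines.
  if ! [[ ($package =~ ^#.*) || (-z $package) ]]; then
    Rscript -e "remotes::install_version('$package', version = '$version')"
  fi
done <"$input"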

r-requirements.txt

# packages for rmarkdown
htmltools=0.5.2
jsonlite=1.7.2
...
rmarkdown=2.11

# more packages
...
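
To see how a requirements file will be interpreted before spending build time on it, the parsing step can be run on its own. The following is an illustrative dry run, not part of the original installer: the function name `list_r_packages` and the temporary sample file are hypothetical, and nothing is installed.

```shell
# Dry run: print the "package version" pairs the installer would act on.
# list_r_packages is a hypothetical helper, not part of the original script.
list_r_packages() {
  while IFS='=' read -r package version || [ -n "$package" ]; do
    # Trim leading and trailing spaces from both fields.
    package=$(echo "$package" | sed 's/^ *//; s/ *$//')
    version=$(echo "$version" | sed 's/^ *//; s/ *$//')
    # Skip blank lines and comment lines.
    case $package in
      ''|'#'*) continue ;;
    esac
    echo "$package $version"
  done <"$1"
}

# Sample input built from two entries of the file above, plus a comment and a blank line.
sample=$(mktemp)
printf '%s\n' '# packages for rmarkdown' 'htmltools=0.5.2' '' 'rmarkdown = 2.11' >"$sample"
list_r_packages "$sample"
rm -f "$sample"
```

For the sample input this prints `htmltools 0.5.2` and `rmarkdown 2.11`, one per line; the comment and the blank line are skipped.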

Dockerfile

FROM debian:bullseye

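# install R and the system libraries its packages build against
# (update and install share one RUN so a cached package index is never reused on its own)
RUN apt-get update \
    && apt-get install -y r-base r-base-dev libatlas3-base r-recommended libssl-dev openssl \
    libcurl4-openssl-dev libfontconfig1-dev libxml2-dev xml2 pandoc lua5.3 clang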
ENV ARROW_S3=ON \
    LIBARROW_MINIMAL=false \
    LIBARROW_BINARY=true \
    RSTUDIO_PANDOC=/usr/lib/rstudio-server/bin/pandoc \
    TZ=Etc/UTC
COPY r-requirements.txt .
COPY scripts/install_r_packages.sh scripts/install_r_packages.sh
RUN bash scripts/install_r_packages.sh

# create user
ENV REPORT_USER="reporter"
ENV PROJECT_HOME=/home/${REPORT_USER}/<project>
RUN useradd -ms /bin/bash ${REPORT_USER} \
    && mkdir /data \
    && mkdir /opt/mlflow \
    && chown -R ${REPORT_USER}:${REPORT_USER} /data \
    && chown -R ${REPORT_USER}:${REPORT_USER} /opt/mlflow

# copy project files
WORKDIR ${PROJECT_HOME}
COPY src src
... bla bla bla ...
COPY requirements.txt .
RUN chown -R ${REPORT_USER}:${REPORT_USER} ${PROJECT_HOME}

# Install python Anaconda env
ENV ANACONDA_PATH="/opt/anaconda3"
ENV PATH=${ANACONDA_PATH}/bin:${PATH}
ENV ANACONDA_INSTALLER=Anaconda3-2021.11-Linux-x86_64.sh
RUN mkdir ${ANACONDA_PATH} \
    && chown -R ${REPORT_USER}:${REPORT_USER} ${ANACONDA_PATH}
RUN apt-get update && apt-get install -y wget
USER ${REPORT_USER}
# the installer is downloaded into the current WORKDIR, so it is removed from there
RUN wget https://repo.anaconda.com/archive/${ANACONDA_INSTALLER} \
    && /bin/bash ${ANACONDA_INSTALLER} -b -u -p ${ANACONDA_PATH} \
    && chown -R ${REPORT_USER} ${ANACONDA_PATH} \
    && rm -vf ${ANACONDA_INSTALLER} \
    && echo ". ${ANACONDA_PATH}/etc/profile.d/conda.sh" >> ~/.bashrc \
    && echo "conda activate base" >> ~/.bashrc
RUN pip3 install --upgrade pip \
    && pip3 install -r requirements.txt \
    && pip3 install awscli

# run training and report
ENV PYTHONPATH=/home/${REPORT_USER}/<project> \
    MLFLOW_TRACKING_URI=... \
    MLFLOW_EXPERIMENT_NAME=...

CMD dvc config core.no_scm true \
    && dvc repro
