Importing R CRAN, Bioconductor and GitHub R packages in one Dockerfile
I apologize, as I think this may be a simple question, but I am really struggling to understand it!
As background, I am trying to create a Dockerfile which installs a lot of R CRAN and R Bioconductor packages as well as some R packages from GitHub. I want to do this as quickly as possible, so I am using rocker base images to install binaries; see here for a great quick tutorial: https://datawookie.dev/blog/2019/01/docker-images-for-rr-base-vs-r-apt/
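My approach is to first install all of the packages I need as binaries where they are available, and otherwise install them from source. After that, I use the Bioconductor base image to install the necessary Bioconductor packages.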
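However, after I import the Bioconductor base image, the packages I installed via the rocker base image are no longer available. This is where I feel I don't have a clear understanding of how Dockerfiles work, but I can't seem to find the answer in any documentation. Is there a way to copy these packages over after importing another image? I wasn't aware this was necessary, and I have seen others take the same approach, like the poster of this question: Minimizing the size of docker image R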
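As a note, I import the Bioconductor base image because I thought it would help with dependency issues. I imagine I could install the Bioconductor packages the same way as the R packages that aren't available as binaries, but I want to do this as quickly and cleanly as possible, and I thought that would slow things down.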
Essentially, I want to know the quickest way to install R binaries, R non-binaries, R Bioconductor and GitHub packages all in one Dockerfile.
Below is an example of my approach with a small subset of the packages I need. Note that I have shown my full approach to installing R binaries, R non-binaries, R Bioconductor and GitHub packages, but for the issue I am having, see what happens to the tidyverse R package before and after I import the Bioconductor image; calling library(tidyverse) works before but fails after:
Dockerfile
## Use r-ubuntu, prev r-apt:bionic to enable the use of binary r packages for speed for R 4.0
FROM rocker/r-ubuntu:18.04
## Install available binaries - for speed
RUN apt-get update && \
apt-get install -y -qq \
r-cran-tidyverse \
r-cran-ids \
r-cran-snow
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## This works
RUN R -e 'library(tidyverse)'
## Install Bioconductor packages
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
COPY ./requirements-bioc.R .
#Don't bother running for speed but this will run
#RUN R -e 'BiocManager::install(ask = F)' && Rscript requirements-bioc.R
#This will fail - can't find the package
RUN R -e 'library(tidyverse)'
## Install from GH the following
#Don't bother running for speed but this will run
#RUN installGithub.r mojaveazure/loomR
EXPOSE 8787
## Make R the default
CMD ["R"]
requirements-src.R
pkgs <- c(
'spelling',
'english',
'DT'
)
install.packages(pkgs)
requirements-bioc.R
bioc_pkgs<-c(
'biomaRt',
'DropletUtils',
'rhdf5'
)
BiocManager::install(bioc_pkgs,ask=F)
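For context on why library(tidyverse) fails after the second FROM: each FROM instruction starts a new build stage from the named base image, and by default the final image is built from the last stage only, so nothing installed in an earlier stage carries over unless it is copied explicitly. A minimal sketch of that behaviour, using the two base images from the question; the stage name cran and the marker file /stage-one.txt are hypothetical and only there for illustration:

```dockerfile
# Stage 1: named "cran" (hypothetical name) so that later stages can refer to it
FROM rocker/r-ubuntu:18.04 AS cran
RUN echo "written in stage one" > /stage-one.txt

# Stage 2: a second FROM starts again from a different base image;
# nothing installed or written in stage 1 is present here
FROM bioconductor/bioconductor_docker:RELEASE_3_12

# Files from an earlier stage have to be copied across explicitly
COPY --from=cran /stage-one.txt /stage-one.txt
RUN cat /stage-one.txt
```

The same COPY --from mechanism could in principle move an R library directory between stages, but compiled packages built against one operating system and R build may not load in another, so treat that as an assumption to test rather than a recommended fix. Installing everything on top of a single base image avoids the problem altogether.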
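Just for the benefit of others facing a similar issue, I will post my solution. I'm not saying this is the only solution, so if anyone else finds a better option, I will update it.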
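In the end, my approach to creating a Docker image that installs a lot of R CRAN and R Bioconductor packages, as well as some R packages from GitHub, is the one laid out in the Dockerfile below.
My solution applies its steps in that order and should prove quick and efficient (my use case is an R package that needs 80+ other packages from CRAN, Bioconductor and GitHub as runtime dependencies). Also, since it uses the latest versions of Rocker RStudio and of the packages, it should stay up to date with the latest releases of the software and packages.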
The Dockerfile looks like this:
#LABEL maintainer="John Doe"
## Use rocker/rstudio, which installs binaries from RStudio's RSPM service by default
## and uses the latest stable Ubuntu, R and Bioconductor versions
FROM rocker/rstudio
## Add packages dependencies - from Bioconductor
RUN apt-get update \
&& apt-get install -y --no-install-recommends apt-utils \
&& apt-get install -y --no-install-recommends \
## Basic deps
gdb \
libxml2-dev \
python3-pip \
libz-dev \
liblzma-dev \
libbz2-dev \
libpng-dev \
libgit2-dev \
## sys deps from bioc_full
pkg-config \
fortran77-compiler \
byacc \
automake \
curl \
## This section installs libraries
libpcre2-dev \
libnetcdf-dev \
libhdf5-serial-dev \
libfftw3-dev \
libopenbabel-dev \
libopenmpi-dev \
libxt-dev \
libudunits2-dev \
libgeos-dev \
libproj-dev \
libcairo2-dev \
libtiff5-dev \
libreadline-dev \
libgsl0-dev \
libgslcblas0 \
libgtk2.0-dev \
libgl1-mesa-dev \
libglu1-mesa-dev \
libgmp3-dev \
libhdf5-dev \
libncurses-dev \
libbz2-dev \
libxpm-dev \
liblapack-dev \
libv8-dev \
libgtkmm-2.4-dev \
libmpfr-dev \
libmodule-build-perl \
libapparmor-dev \
libprotoc-dev \
librdf0-dev \
libmagick++-dev \
libsasl2-dev \
libpoppler-cpp-dev \
libprotobuf-dev \
libpq-dev \
libperl-dev \
## software - perl extensions and modules
libarchive-extract-perl \
libfile-copy-recursive-perl \
libcgi-pm-perl \
libdbi-perl \
libdbd-mysql-perl \
libxml-simple-perl \
libmysqlclient-dev \
default-libmysqlclient-dev \
libgdal-dev \
## new libs
libglpk-dev \
## Databases and other software
sqlite \
openmpi-bin \
mpi-default-bin \
openmpi-common \
openmpi-doc \
tcl8.6-dev \
tk-dev \
default-jdk \
imagemagick \
tabix \
ggobi \
graphviz \
protobuf-compiler \
jags \
## Additional resources
xfonts-100dpi \
xfonts-75dpi \
biber \
libsbml5-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
#install R CRAN binary packages
RUN install2.r -e \
testthat
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## Install Bioconductor packages
COPY ./requirements-bioc.R .
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libfftw3-dev \
gcc && apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN Rscript -e 'requireNamespace("BiocManager"); BiocManager::install(ask=F);' \
&& Rscript requirements-bioc.R
## Install from GH the following
RUN installGithub.r theislab/kBET \
chris-mcginnis-ucsf/DoubletFinder
Note that the CRAN packages installed from source and the Bioconductor packages are listed in separate scripts kept in the same folder as the Dockerfile.
requirements-src.R:
pkgs <- c(
'spelling',
'english',
'Seurat')
install.packages(pkgs)
requirements-bioc.R:
bioc_pkgs<-c(
'biomaRt',
'SingleCellExperiment',
'SummarizedExperiment')
requireNamespace("BiocManager")
BiocManager::install(bioc_pkgs,ask=F)
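One caveat with installing from a script: install.packages() only raises a warning when a package fails to install, so the RUN step can succeed and the build can finish with packages missing (the -e flag passed to install2.r in the Dockerfile covers this for that one step). A minimal sketch of an extra check, which is an addition and not part of the original answer, that stops the build if any package from requirements-src.R cannot be loaded:

```dockerfile
FROM rocker/rstudio

COPY ./requirements-src.R .
RUN Rscript requirements-src.R

# Hypothetical extra step (not in the original answer): fail the build
# if any package listed in requirements-src.R cannot be loaded
RUN Rscript -e 'pkgs <- c("spelling", "english", "Seurat"); ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE); if (!all(ok)) stop("Not installed: ", paste(pkgs[!ok], collapse = ", "))'
```

In the full Dockerfile the check would sit directly after the existing RUN Rscript requirements-src.R line, and the same pattern could follow the Bioconductor step. The package list is repeated by hand here, so it has to be kept in sync with the script.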