Importing R CRAN, Bioconductor and GitHub R packages in one Dockerfile
My apologies, because I think this may be a simple question, but it is something that I am really struggling to understand!
As background, I am trying to create a Dockerfile which installs a lot of R CRAN and R Bioconductor packages as well as some R packages from GitHub. I want to do this as quickly as possible, so I'm using rocker's base image to install binary packages; see here for a great, quick tutorial: https://datawookie.dev/blog/2019/01/docker-images-for-rr-base-versus-r-apt/
My approach is first to install all my necessary packages as binaries and, if any are not available as binaries, to install them from source. After this, I use the Bioconductor base image to install the necessary Bioconductor packages.
However, the packages I installed through the rocker base image aren't available after I import the Bioconductor base image. This is where I feel I don't have a clear understanding of creating Dockerfiles, but I can't seem to find an answer in any documentation. Is there some way to copy these over after importing another image? I don't know if this is necessary; I have seen others do it the same way, such as the question poster here: Minimizing the size of docker image R shiny app
To note, I import the Bioconductor base image as I thought this would help deal with dependency issues. I guess I could just install the Bioconductor packages the same way as the R packages that weren't available as binaries, but I want to do this as quickly and cleanly as possible and I thought that this would slow things down.
Essentially, I want to know the quickest way to install R binary packages, R packages from source, Bioconductor packages and GitHub packages all in one Dockerfile.
An example of my approach is below, with a very small subset of the packages I need. Note that I have shown my full approach to installing R binaries, R non-binaries, Bioconductor and GitHub packages, but for the issue I am having, see what happens to the tidyverse R package before and after I import the Bioconductor image; the call library(tidyverse) runs before but fails after:
Dockerfile
## Use r-ubuntu, prev r-apt:bionic to enable the use of binary r packages for speed for R 4.0
FROM rocker/r-ubuntu:18.04
## Install available binaries - for speed
RUN apt-get update && \
apt-get install -y -qq \
r-cran-tidyverse \
r-cran-ids \
r-cran-snow
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## This works
RUN R -e 'library(tidyverse)'
## Install Bioconductor packages
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
COPY ./requirements-bioc.R .
#Don't bother running for speed but this will run
#RUN R -e 'BiocManager::install(ask = F)' && Rscript requirements-bioc.R
#This will fail - can't find the package
RUN R -e 'library(tidyverse)'
## Install from GH the following
#Don't bother running for speed but this will run
#RUN installGithub.r mojaveazure/loomR
EXPOSE 8787
## Make R the default
CMD ["R"]
requirements-src.R
pkgs <- c(
'spelling',
'english',
'DT'
)
install.packages(pkgs)
requirements-bioc.R
bioc_pkgs<-c(
'biomaRt',
'DropletUtils',
'rhdf5'
)
BiocManager::install(bioc_pkgs,ask=F)
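A note on why the second library(tidyverse) call fails: each FROM instruction starts a new build stage from that image's own filesystem, and only the last stage ends up in the final image, so the packages installed in the rocker stage are simply not present in the Bioconductor stage. Files can be carried across stages with COPY --from, as in the minimal sketch below. The stage name cran and both library paths are assumptions for illustration (check .libPaths() in each image), and compiled packages copied between images with different R versions or system libraries may not load, so treat this as a demonstration of the mechanism rather than a recommended fix.

```dockerfile
# Stage 1: install CRAN binaries; "cran" is a hypothetical stage name
FROM rocker/r-ubuntu:18.04 AS cran
RUN apt-get update && \
    apt-get install -y -qq r-cran-tidyverse

# Stage 2: a new FROM starts from the Bioconductor image's filesystem;
# nothing installed in stage 1 exists here unless it is copied explicitly
FROM bioconductor/bioconductor_docker:RELEASE_3_12

# Assumed paths: apt-installed r-cran-* packages are expected under
# /usr/lib/R/site-library in stage 1, and the destination is expected to be
# on the library path of stage 2; verify both with .libPaths()
COPY --from=cran /usr/lib/R/site-library /usr/local/lib/R/site-library

# Report whether tidyverse is now visible (prints TRUE or FALSE, never fails the build)
RUN R -e '"tidyverse" %in% rownames(installed.packages())'
```

Using a single base image for every installation step avoids the cross-stage copy, and the binary-compatibility risk that comes with it, altogether.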
Just in the interest of anyone else who is facing a similar problem, I will post my solution. I am not suggesting that this is the only solution, so if others find better alternatives, I'll update this answer.
In the end, my approach to creating a Docker image which installs a lot of R CRAN and R Bioconductor packages, as well as some R packages from GitHub, was to use a single rocker base image and install, in order, the system dependencies, the CRAN binaries, the remaining CRAN packages, the Bioconductor packages and finally the GitHub packages.
My solution uses these steps in this order and should prove to be fast and efficient. The use case for me was an R package which required more than 80 other packages from CRAN, Bioconductor and GitHub as dependencies, and this solution reduced the runtime to a fraction of the original. Also, since it uses the latest versions of the Rocker RStudio image and of the packages, it should stay up to date with the latest versions of software and packages.
The Dockerfile looks like this:
#LABEL maintainer="John Doe"
## Use rstudio installs binaries from RStudio's RSPM service by default,
## Uses the latest stable ubuntu, R and Bioconductor versions
FROM rocker/rstudio
## Add packages dependencies - from Bioconductor
RUN apt-get update \
&& apt-get install -y --no-install-recommends apt-utils \
&& apt-get install -y --no-install-recommends \
## Basic deps
gdb \
libxml2-dev \
python3-pip \
libz-dev \
liblzma-dev \
libbz2-dev \
libpng-dev \
libgit2-dev \
## sys deps from bioc_full
pkg-config \
fortran77-compiler \
byacc \
automake \
curl \
## This section installs libraries
libpcre2-dev \
libnetcdf-dev \
libhdf5-serial-dev \
libfftw3-dev \
libopenbabel-dev \
libopenmpi-dev \
libxt-dev \
libudunits2-dev \
libgeos-dev \
libproj-dev \
libcairo2-dev \
libtiff5-dev \
libreadline-dev \
libgsl0-dev \
libgslcblas0 \
libgtk2.0-dev \
libgl1-mesa-dev \
libglu1-mesa-dev \
libgmp3-dev \
libhdf5-dev \
libncurses-dev \
libbz2-dev \
libxpm-dev \
liblapack-dev \
libv8-dev \
libgtkmm-2.4-dev \
libmpfr-dev \
libmodule-build-perl \
libapparmor-dev \
libprotoc-dev \
librdf0-dev \
libmagick++-dev \
libsasl2-dev \
libpoppler-cpp-dev \
libprotobuf-dev \
libpq-dev \
libperl-dev \
## software - perl extensions and modules
libarchive-extract-perl \
libfile-copy-recursive-perl \
libcgi-pm-perl \
libdbi-perl \
libdbd-mysql-perl \
libxml-simple-perl \
libmysqlclient-dev \
default-libmysqlclient-dev \
libgdal-dev \
## new libs
libglpk-dev \
## Databases and other software
sqlite \
openmpi-bin \
mpi-default-bin \
openmpi-common \
openmpi-doc \
tcl8.6-dev \
tk-dev \
default-jdk \
imagemagick \
tabix \
ggobi \
graphviz \
protobuf-compiler \
jags \
## Additional resources
xfonts-100dpi \
xfonts-75dpi \
biber \
libsbml5-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
#install R CRAN binary packages
RUN install2.r -e \
testthat
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## Install Bioconductor packages
COPY ./requirements-bioc.R .
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libfftw3-dev \
gcc && apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN Rscript -e 'requireNamespace("BiocManager"); BiocManager::install(ask=F);' \
&& Rscript requirements-bioc.R
## Install from GH the following
RUN installGithub.r theislab/kBET \
chris-mcginnis-ucsf/DoubletFinder
Note that the CRAN packages installed from source and the Bioconductor packages are listed in separate scripts in the same folder as your Dockerfile.
requirements-src.R:
pkgs <- c(
'spelling',
'english',
'Seurat')
install.packages(pkgs)
requirements-bioc.R:
bioc_pkgs<-c(
'biomaRt',
'SingleCellExperiment',
'SummarizedExperiment')
requireNamespace("BiocManager")
BiocManager::install(bioc_pkgs,ask=F)
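One thing to watch for: install.packages() and BiocManager::install() typically report a failed installation as a warning rather than an error, so a build can finish successfully even though a package is missing. A minimal sketch of an extra build step that stops the build in that case is below. The package names are taken from the two scripts above; the step itself is an addition for illustration and not part of the solution as posted. It would go after the installation steps in the Dockerfile.

```dockerfile
# Optional check, added for illustration: stop the build if any expected
# package cannot be loaded, since failed installs only produce warnings
RUN Rscript -e 'pkgs <- c("spelling", "english", "Seurat", "biomaRt", "SingleCellExperiment", "SummarizedExperiment"); ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE); if (!all(ok)) stop("Missing packages: ", paste(pkgs[!ok], collapse = ", "))'
```

The stop() call makes Rscript exit with a non-zero status, which is what causes Docker to abort the build at this step.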