Importing R CRAN, Bioconductor and GitHub R packages in one Dockerfile
I apologise, because I think this may be a simple question, but I am really struggling to understand it!
As background, I am trying to create a Dockerfile which installs a lot of R CRAN and R Bioconductor packages, as well as some R packages from GitHub. I want to do this as quickly as possible, so I am using the rocker base image to install binaries; see here for a great quick tutorial: https://datawookie.dev/blog/2019/01/docker-images-for-rr-base-vs-r-apt/
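My approach is to first install all the packages I need as binaries where these are available, and otherwise install them from source. After that, I use the Bioconductor base image to install the necessary Bioconductor packages.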
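However, after I import the Bioconductor base image, the packages I installed on the rocker base image are no longer available. This is where I feel I lack a clear understanding of how Dockerfiles are built, but I can't seem to find the answer in any documentation. Is there a way to copy these over after importing another image? I don't know whether this is even necessary; I have seen others do it this way, such as the poster of this question: Minimizing the size of docker image R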
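Note that I imported the Bioconductor base image because I thought it would help with dependency issues. I think I could install the Bioconductor packages like ordinary R packages that are not available as binaries, but I want to do this as quickly and cleanly as possible, and I think that would slow things down.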
Essentially, I want to know the quickest way to install R binaries, R non-binaries, R Bioconductor packages and GitHub packages all in one Dockerfile.
Below is an example of my approach with a small subset of the packages I need. Note that I have shown my full approach to installing R binaries, R non-binaries, R Bioconductor and GitHub packages, but for the issue I am having, look at what happens to the tidyverse R package before and after I import the Bioconductor image: calling library(tidyverse) works before but fails after:
Dockerfile
## Use r-ubuntu, prev r-apt:bionic to enable the use of binary r packages for speed for R 4.0
FROM rocker/r-ubuntu:18.04
## Install available binaries - for speed
RUN apt-get update && \
apt-get install -y -qq \
r-cran-tidyverse \
r-cran-ids \
r-cran-snow
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## This works
RUN R -e 'library(tidyverse)'
## Install Bioconductor packages
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
COPY ./requirements-bioc.R .
#Don't bother running for speed but this will run
#RUN R -e 'BiocManager::install(ask = F)' && Rscript requirements-bioc.R
#This will fail - can't find the package
RUN R -e 'library(tidyverse)'
## Install from GH the following
#Don't bother running for speed but this will run
#RUN installGithub.r mojaveazure/loomR
EXPOSE 8787
## Make R the default
CMD ["R"]
requirements-src.R
pkgs <- c(
'spelling',
'english',
'DT'
)
install.packages(pkgs)
requirements-bioc.R
bioc_pkgs<-c(
'biomaRt',
'DropletUtils',
'rhdf5'
)
BiocManager::install(bioc_pkgs,ask=F)
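For readers wondering why library(tidyverse) fails in the example above: each FROM instruction starts a new build stage from the named image, and nothing installed in an earlier stage carries over unless it is copied explicitly with COPY --from. A minimal sketch of a single-stage alternative, assuming the Bioconductor image alone is used as the base and tidyverse is installed into it with install.packages (a choice made here for illustration, not the approach the poster finally settled on):

```dockerfile
# Sketch only: one base image, so every later RUN sees the same filesystem.
FROM bioconductor/bioconductor_docker:RELEASE_3_12

# Install CRAN packages inside this image. Assumption: install.packages is
# acceptable here; it may be slower than the apt binaries in the question.
RUN R -e 'install.packages("tidyverse")'

# Bioconductor packages go into the same image.
COPY ./requirements-bioc.R .
RUN Rscript requirements-bioc.R

# Should now succeed, because tidyverse was installed in this stage.
RUN R -e 'library(tidyverse)'
```

The trade-off is build speed: everything lives in one stage, at the cost of not reusing the prebuilt r-cran-* packages from the rocker/r-ubuntu image.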
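Just for the benefit of others facing a similar issue, I will post my solution. I am not saying this is the only solution, so if anyone else finds a better option, I will update it.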
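In the end, my approach to creating a Docker image which installs a lot of R CRAN and R Bioconductor packages, as well as some R packages from GitHub, was the following.
The steps in the Dockerfile below, used in this order, should prove to be a quick and efficient solution (my use case was an R package which needs 80+ other packages from CRAN, Bioconductor and GitHub as runtime dependencies). In addition, because it uses the latest version of Rocker RStudio and of the packages, the image should stay up to date with the latest versions of software and packages.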
The Dockerfile looks like this:
#LABEL maintainer="John Doe"
## Use rocker/rstudio, which installs binaries from RStudio's RSPM service by default
## and uses the latest stable Ubuntu, R and Bioconductor versions
FROM rocker/rstudio
## Add packages dependencies - from Bioconductor
RUN apt-get update \
&& apt-get install -y --no-install-recommends apt-utils \
&& apt-get install -y --no-install-recommends \
## Basic deps
gdb \
libxml2-dev \
python3-pip \
libz-dev \
liblzma-dev \
libbz2-dev \
libpng-dev \
libgit2-dev \
## sys deps from bioc_full
pkg-config \
fortran77-compiler \
byacc \
automake \
curl \
## This section installs libraries
libpcre2-dev \
libnetcdf-dev \
libhdf5-serial-dev \
libfftw3-dev \
libopenbabel-dev \
libopenmpi-dev \
libxt-dev \
libudunits2-dev \
libgeos-dev \
libproj-dev \
libcairo2-dev \
libtiff5-dev \
libreadline-dev \
libgsl0-dev \
libgslcblas0 \
libgtk2.0-dev \
libgl1-mesa-dev \
libglu1-mesa-dev \
libgmp3-dev \
libhdf5-dev \
libncurses-dev \
libbz2-dev \
libxpm-dev \
liblapack-dev \
libv8-dev \
libgtkmm-2.4-dev \
libmpfr-dev \
libmodule-build-perl \
libapparmor-dev \
libprotoc-dev \
librdf0-dev \
libmagick++-dev \
libsasl2-dev \
libpoppler-cpp-dev \
libprotobuf-dev \
libpq-dev \
libperl-dev \
## software - perl extensions and modules
libarchive-extract-perl \
libfile-copy-recursive-perl \
libcgi-pm-perl \
libdbi-perl \
libdbd-mysql-perl \
libxml-simple-perl \
libmysqlclient-dev \
default-libmysqlclient-dev \
libgdal-dev \
## new libs
libglpk-dev \
## Databases and other software
sqlite \
openmpi-bin \
mpi-default-bin \
openmpi-common \
openmpi-doc \
tcl8.6-dev \
tk-dev \
default-jdk \
imagemagick \
tabix \
ggobi \
graphviz \
protobuf-compiler \
jags \
## Additional resources
xfonts-100dpi \
xfonts-75dpi \
biber \
libsbml5-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
#install R CRAN binary packages
RUN install2.r -e \
testthat
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## Install Bioconductor packages
COPY ./requirements-bioc.R .
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libfftw3-dev \
gcc && apt-get clean \
&& rm -rf /var/lib/apt/lists/*
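## requireNamespace() only checks for BiocManager, so install it first if it is missing
RUN Rscript -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager"); BiocManager::install(ask = FALSE)' \
&& Rscript requirements-bioc.R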
## Install from GH the following
RUN installGithub.r theislab/kBET \
chris-mcginnis-ucsf/DoubletFinder
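Note that the CRAN packages installed from source and the Bioconductor packages are kept in separate scripts, saved in the same folder as the Dockerfile.
requirements-src.R: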
pkgs <- c(
'spelling',
'english',
'Seurat')
install.packages(pkgs)
requirements-bioc.R:
bioc_pkgs<-c(
'biomaRt',
'SingleCellExperiment',
'SummarizedExperiment')
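# Install BiocManager if it is not already present, then the Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(bioc_pkgs, ask = FALSE)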
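One thing worth guarding against with this setup: install.packages() reports a failed installation as a warning rather than an error, so the image can build successfully with a package missing. A minimal sketch of a check that could be appended to the end of the Dockerfile, assuming the package names from the two example scripts above (the list is illustrative and would need to match the real requirements):

```dockerfile
# Sketch only: fail the build if any expected package cannot be loaded.
# The package vector is an assumption taken from the example scripts.
RUN Rscript -e 'pkgs <- c("spelling", "english", "Seurat", "biomaRt", "SingleCellExperiment", "SummarizedExperiment"); \
    ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE); \
    if (!all(ok)) stop("Missing packages: ", paste(pkgs[!ok], collapse = ", "))'
```

Because stop() makes Rscript exit with a non-zero status, docker build halts at this step and names the packages that did not install.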