R install packages from Shell

I am trying to implement a reducer for Hadoop Streaming using R. However, I need a way to access libraries that are not built into R, such as dplyr. Based on my research, there seem to be two approaches:

(1) In the reducer code, install the required libraries into a temporary folder, so that they are disposed of when the session ends, like this:

.libPaths(c(.libPaths(), temp <- tempdir()))
install.packages("dplyr", lib=temp, repos='http://cran.us.r-project.org')
library(dplyr)
...

However, this approach has a dramatic overhead that grows with the number of libraries you install, so most of the time is wasted on installation (a sophisticated library like dplyr has many dependencies, which take minutes to install in a vanilla R session).

So it sounds like I need to install the libraries beforehand, which leads to approach 2.

(2) My cluster is fairly big, and I have to use a tool like Ansible to make this work. So I would prefer a single Linux shell command to install the library. I have seen R CMD INSTALL... before; however, it seems to install packages only from a source file, whereas install.packages() in the R console figures out the mirror, pulls the source file, and installs it in one command.

Can anyone show me how to use one command line in the shell to non-interactively install an R package? (Sorry for this much background. If anyone thinks I am not even following the right philosophy, feel free to leave a comment on how R packages should be managed across the whole cluster.)

tl;dr

Rscript -e 'install.packages("drat", repos="https://cloud.r-project.org")'
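When the one-liner is driven by a provisioning tool such as Ansible, it may help to wrap it so that already-installed packages are skipped and a failed install produces a non-zero exit status (install.packages() reports a failed install as a warning, not an error). The following is only a minimal sketch: the function name install_r_pkg is hypothetical, and it assumes Rscript is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer).
# $1 = package name
# $2 = optional repository URL (defaults to the CRAN cloud mirror)
install_r_pkg() {
    # Each -e is one R expression; the trailing arguments are read
    # inside R through commandArgs(trailingOnly = TRUE).
    Rscript \
        -e 'a <- commandArgs(trailingOnly = TRUE)' \
        -e 'if (!requireNamespace(a[1], quietly = TRUE)) install.packages(a[1], repos = a[2])' \
        -e 'if (!requireNamespace(a[1], quietly = TRUE)) quit(status = 1)' \
        "$1" "${2:-https://cloud.r-project.org}"
}
```

For example, install_r_pkg drat installs drat from the default mirror, and the exit status of the call tells the calling tool whether the package can be loaded afterwards.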

You mentioned that you are trying to install dplyr into a custom lib location on your disk. Be aware that the dplyr package does not support that. You can read more in dplyr#4641.


Moreover, if you are installing a private package published in an internal CRAN-like repository (created by drat or tools::write_PACKAGES), you can simply pass several repositories in the repos argument, and dependencies will be resolved from CRAN automatically.

Rscript -e 'install.packages("priv.pkg", repos=c("cran.priv","https://cloud.r-project.org"))'

This is a very handy feature of R repositories, although for production use I would recommend caching packages from CRAN locally and using those, so that you are never surprised by breaking changes in your dependencies. For quality information about handling R in production, I suggest looking at the talk by Wit Jakuczun at WhyR2019, "How to make R great for machine learning in (not only) Enterprise" (slides, video).

You may find littler useful. It is a command-line front-end / variant of R (which uses the R-embedding interface).

I use the install.r script all the time to install packages from the shell. There is a second variant with more command-line argument parsing, but it has an added dependency.
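As a rough illustration of how these littler scripts might be called from a provisioning step, here is a small sketch. It assumes littler is installed and that its example scripts install.r and install2.r are on the PATH (they are not always linked there by default). The wrapper name install_with_littler is hypothetical, and the --error and --skipinstalled options are those of install2.r as I understand them, so check install2.r --help on your version.

```shell
#!/bin/sh
# Hypothetical wrapper around littler's example scripts.
# Prefers install2.r (which can halt on failure and skip installed packages)
# and falls back to the simpler install.r when install2.r is not available.
install_with_littler() {
    if command -v install2.r >/dev/null 2>&1; then
        install2.r --error --skipinstalled "$@"
    else
        install.r "$@"
    fi
}
```

A call such as install_with_littler dplyr data.table then installs both packages in one shell command.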
