
Attaching packages to a "temporary" search path in R

Inside a function, I am sourcing a script:

f <- function(){
  source("~/Desktop/sourceme.R") # source someone else's script
  # do some stuff to the variables read in
}
f()
search() # library sourceme.R attaches is all the way in the back!

Unfortunately, the scripts that I am sourcing are not fully under my control. They make calls to library(somePackage), which pollutes the search path.

This is mostly a problem if the author of sourceme.R expects the package they attach to sit near the top of the search path, close to the global environment. If I have attached a package myself that masks some of the function names they expect to be available, their script may end up calling the wrong function.
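To make the masking concern concrete, here is a minimal sketch that simulates two packages with attach() on plain lists. The names demo_first, demo_second and describe are hypothetical stand-ins, not real packages or functions.

```r
# Simulate two "packages" that both export a function called describe().
# attach() puts each list at position 2 of the search path by default,
# so whatever is attached last sits closest to the global environment.
attach(list(describe = function() "from demo_first"),
       name = "demo_first", warn.conflicts = FALSE)
attach(list(describe = function() "from demo_second"),
       name = "demo_second", warn.conflicts = FALSE)

head(search(), 3)

# Lookup walks the search path from the front, so the later attach wins.
result <- describe()
result  # "from demo_second"

# Clean up the demo entries.
detach("demo_second", character.only = TRUE)
detach("demo_first", character.only = TRUE)
```

A script written against demo_first would silently get the demo_second definition here, which is the situation described above.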

Is there a way to source scripts but somehow set up my own temporary search path that "resets" after the function has finished running?

I would consider sourcing the script in a separate R process using the callr package and then returning the environment created by the sourced file.

Using a separate R process prevents your search path from being polluted. I'm guessing there may be some side effects in your global environment that you do want (such as defining new functions or variables). The local argument of the source function lets you specify the environment in which the parsed script is evaluated. If you return this environment from the other R process, you can access any result you need.
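As a small illustration of the local argument on its own (no second process yet), the sketch below sources a throwaway script into a fresh environment. The script contents and the names script_path and script_env are made up for the example.

```r
# Write a tiny stand-in script to a temporary file.
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "total <- sum(x)"), script_path)

# Evaluate the script inside its own environment instead of the caller's.
script_env <- new.env()
source(script_path, local = script_env)

ls(script_env)    # "total" "x"
script_env$total  # 55

# The assignment did not land in the global environment.
exists("total", envir = globalenv(), inherits = FALSE)
```

Note that local only controls where the script's assignments go. A library() call inside the script still modifies the search path of the running session, which is why the separate process is needed as well.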

I'm not sure what your script looks like, but say I have this file that modifies the search path:

# messWithSearchPath.R

library(dplyr)

a <- data.frame(groupID = rep(1:3, 10), value = rnorm(30))

b <- a %>% 
  group_by(groupID) %>% 
  summarize(agg = sum(value))

From my top-level script, I would write a wrapper function that sources it in a new environment and have callr execute this function:

RogueScript <- function(){
  
  rogueEnv <- new.env()
  
  source("messWithSearchPath.R", local = rogueEnv)
  
  rogueEnv
  
}

before <- search()

scriptResults <- callr::r(RogueScript)

scriptResults$b
#>   groupID       agg
#> 1       1 -2.871642
#> 2       2  3.368499
#> 3       3  1.159509

identical(before, search())
#> [1] TRUE

If the scripts have other side effects (such as setting options or establishing external connections), this method probably won't work. There may be workarounds depending on what they are intended to do, but this should work if you just want the variables/functions created. It also prevents the scripts from conflicting with each other, not just with your top-level script.
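One possible workaround for the options case, sketched under assumptions: have the function that runs in the child process return selected options alongside the objects, and apply them in the parent only if you want them. The helper source_isolated and its keep_options parameter are hypothetical names; the sketch relies only on callr::r() and its args argument, and the usage part is skipped when callr is not installed.

```r
# Hypothetical helper: source `path` in a child R process and hand back
# the objects it created plus the options named in `keep_options`.
source_isolated <- function(path, keep_options = character(0)) {
  callr::r(
    function(path, keep_options) {
      env <- new.env()
      source(path, local = env)
      list(
        objects = as.list(env),            # variables/functions the script made
        options = options()[keep_options]  # option values after the script ran
      )
    },
    args = list(path = path, keep_options = keep_options)
  )
}

# Stand-in script that sets an option and creates a variable.
script_path <- tempfile(fileext = ".R")
writeLines(c("options(digits = 4)", "x <- pi"), script_path)

if (requireNamespace("callr", quietly = TRUE)) {
  res <- source_isolated(script_path, keep_options = "digits")
  options(res$options)  # opt in to the option change made by the script
  res$objects$x
}
```

This does not help with external connections, which cannot be carried across processes this way.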

One way would be to "snapshot" your current search path and try to return to it later:

search.snapshot <- local({
  .snap <- NULL  # NULL until a snapshot is taken, so the is.null() check below can fire
  function(restore = FALSE) {
    if (restore) {
      if (is.null(.snap)) {
        return(character(0))
      } else {
        extras <- setdiff(search(), .snap)
        # may not work if DLLs are loaded
        for (pkg in extras) {
          suppressWarnings(detach(pkg, character.only = TRUE, unload = TRUE))
        }
        return(extras)
      }
    } else .snap <<- search()
  }
})

In action:

search.snapshot()                                  # store current state
get(".snap", envir = environment(search.snapshot)) # view snapshot
#  [1] ".GlobalEnv"        "ESSR"              "package:stats"    
#  [4] "package:graphics"  "package:grDevices" "package:utils"    
#  [7] "package:datasets"  "package:r2"        "package:methods"  
# [10] "Autoloads"         "package:base"     
library(ggplot2)
library(zoo)
# Attaching package: 'zoo'
# The following objects are masked from 'package:base':
#     as.Date, as.Date.numeric
library(dplyr)
# Attaching package: 'dplyr'
# The following objects are masked from 'package:stats':
#     filter, lag
# The following objects are masked from 'package:base':
#     intersect, setdiff, setequal, union
search()
#  [1] ".GlobalEnv"        "package:dplyr"     "package:zoo"      
#  [4] "package:ggplot2"   "ESSR"              "package:stats"    
#  [7] "package:graphics"  "package:grDevices" "package:utils"    
# [10] "package:datasets"  "package:r2"        "package:methods"  
# [13] "Autoloads"         "package:base"     

search.snapshot(TRUE)                              # returns detached packages
# [1] "package:dplyr"   "package:zoo"     "package:ggplot2"

search()
#  [1] ".GlobalEnv"        "ESSR"              "package:stats"    
#  [4] "package:graphics"  "package:grDevices" "package:utils"    
#  [7] "package:datasets"  "package:r2"        "package:methods"  
# [10] "Autoloads"         "package:base"     

I am fairly confident (though I have not verified it) that this will not work with every package, perhaps because of dependencies and/or loaded DLLs. You can try adding force = TRUE to the detach call; I'm not sure whether that will work better or have other undesirable side effects.
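To get the "reset when the function finishes" behaviour asked about, the same snapshot idea can be tied to on.exit(). The sketch below is one way to do that, not a tested general solution: source_with_clean_search is a hypothetical name, the demo script attaches splines only because it ships with R, and the detach is done without unload = TRUE, so namespaces stay loaded.

```r
source_with_clean_search <- function(path) {
  before <- search()
  # Runs when the function exits, even if the script throws an error.
  on.exit({
    for (pkg in setdiff(search(), before)) {
      # Detach only what the script added, most recently attached first.
      detach(pkg, character.only = TRUE)
    }
  }, add = TRUE)

  env <- new.env()
  source(path, local = env)
  env
}

# Stand-in for a script that attaches a package and uses it.
script_path <- tempfile(fileext = ".R")
writeLines(c("library(splines)", "basis <- bs(1:10, df = 3)"), script_path)

before <- search()
res <- source_with_clean_search(script_path)

identical(before, search())  # the search path is back to what it was
dim(res$basis)               # objects created by the script are still available
```

The same caveats apply as for search.snapshot: packages with dependencies that attach other packages, or with loaded DLLs, may not detach cleanly.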
