Updating an existing Rdata file

I need to update one or two data objects in an Rdata file previously created with save. Unless I load the file first, it is easy to forget to re-save some of the objects it already contains. As an example, I'm working on a package with some objects stored in sysdata.rda (look-up tables for internal use, which I do not want to export), and I only want to worry about updating individual objects.

I haven't been able to work out whether there is a standard way to do this, so I wrote my own function.

resave <- function (..., list = character(), file = stop("'file' must be specified")) {
  # remember the caller's environment, where the updated objects live
  caller <- parent.frame()
  # create a staging environment to load the existing R objects
  stage <- new.env()
  load(file, envir=stage)
  # get the list of objects to be "resaved"
  names <- as.character(substitute(list(...)))[-1L]
  list <- c(list, names)
  # copy the objects from the caller to the staging environment
  for (obj in list) assign(obj, get(obj, envir=caller), envir=stage)
  # save everything in the staging environment; envir is needed so that
  # save looks the objects up in stage rather than in this function's frame
  save(list=ls(stage, all.names=TRUE), envir=stage, file=file)
}

It does seem like overkill, though. Is there a better or easier way to do this?

As an aside, am I right in assuming that a new environment created within a function is destroyed after the function call returns?

Here is a slightly shorter version:

resave <- function(..., list = character(), file) {
   previous  <- load(file)
   var.names <- c(list, as.character(substitute(list(...)))[-1L])
   for (var in var.names) assign(var, get(var, envir = parent.frame()))
   save(list = unique(c(previous, var.names)), file = file)
}

I took advantage of the fact that load returns the names of the loaded variables, so I could use the function's own environment instead of creating a new one. When using get, I was careful to look in the environment from which the function is called, i.e. parent.frame(), rather than in the function's own environment.
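To make the point about load's return value concrete, here is a minimal sketch. The object names a and b and the temporary file are made up for illustration; only load, save, new.env, and tempfile from base R are used.

```r
# Illustrative only: the object names and the temp file are assumptions.
path <- tempfile(fileext = ".RData")
a <- 1
b <- "two"
save(a, b, file = path)

# Load into a fresh environment so nothing in the workspace is overwritten.
e <- new.env()
loaded <- load(path, envir = e)

# load() returns (invisibly) a character vector naming the restored objects,
# which is what lets resave() know what was already in the file.
print(loaded)
print(ls(e))
```

Capturing the result this way avoids calling ls() on the target environment, which matters when the objects are loaded into an environment that already holds other variables.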

Here is a simulation:

x1 <- 1
x2 <- 2
x3 <- 3
save(x1, x2, x3, file = "abc.RData")

x1 <- 10
x2 <- 20
x3 <- 30
resave(x1, x3, file = "abc.RData")

load("abc.RData")
x1
# [1] 10
x2
# [1] 2
x3
# [1] 30

I have added a refactored version of @flodel's answer to the stackoverflow package. It uses environments explicitly to be a bit more defensive.

resave <- function(..., list = character(), file) {
  e <- new.env()
  load(file, e)
  list <- union(list, as.character(substitute((...)))[-1L])
  copyEnv(parent.frame(), e, list)
  save(list = ls(e, all.names=TRUE), envir = e, file = file)
}
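The function above relies on copyEnv, which is not part of base R. For readers without that package, the following is a rough self-contained sketch of the same idea. The helper copy_objects and the name resave_sketch are hypothetical stand-ins written for this example, not the package's actual code.

```r
# Hypothetical stand-in for copyEnv(): copy the named objects from one
# environment into another.
copy_objects <- function(from, to, names = ls(from, all.names = TRUE)) {
  for (nm in names) {
    assign(nm, get(nm, envir = from), envir = to)
  }
  invisible(to)
}

# Same structure as the resave() above, using the stand-in helper.
resave_sketch <- function(..., list = character(), file) {
  e <- new.env()
  load(file, envir = e)                      # existing objects go into e
  list <- union(list, as.character(substitute((...)))[-1L])
  copy_objects(parent.frame(), e, list)      # overwrite with caller's values
  save(list = ls(e, all.names = TRUE), envir = e, file = file)
}

# Usage: update x1 in the file while leaving x2 as it was saved.
path <- tempfile(fileext = ".RData")
x1 <- 1
x2 <- 2
save(x1, x2, file = path)

x1 <- 10
x2 <- 20
resave_sketch(x1, file = path)

# Load into a separate environment to inspect the file's contents.
check <- new.env()
load(path, envir = check)
print(mget(c("x1", "x2"), envir = check))
```

Because both load and save are given the staging environment explicitly, nothing is read from or written to the function's own frame, which is the defensive aspect mentioned above.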
