
R package submission error concerning set.seed()

I recently submitted a package to CRAN that passed all the automatic checks but failed the manual ones. One of the errors was the following:

Please do not set a seed to a specific number within a function.

Please do not modify the .GlobalEnv. This is not allowed by the CRAN policies.

I believe the lines of code that these comments refer to are the following:

    if(simul == TRUE){
        set.seed(42)
    }

    w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    beta <- w*beta-(1-w)*beta
    s <- round((1-sparsity)*p)
    toReplace <- sample(p, size = s)
    beta <- replace(beta, list = toReplace, values = 0)

    # Generate the random p-columned matrix of indicator series.
    X <- matrix(data = rnorm((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)

    if(simul == TRUE){
        rm(.Random.seed, envir = globalenv())
    }

Essentially, I am giving the function a simulation option "simul" such that, when it is set to "TRUE", the matrix "X" and the coefficient vector "beta" remain fixed. I remove the seed at the end of this segment (the final lines), because the rest of the code contains variables that should change at each iteration of the simulation. However, as noted in the feedback from CRAN, this is not allowed. What is an alternative way to go about this? I cannot hard-code a fixed vector "beta" or matrix "X" when "simul" is "TRUE", since their dimensions are inputs to the function and thus vary depending on the preferences of the investigator.

If you really, really want to set the seed inside a function, which I believe neither you nor anyone else should do, save the current seed, do whatever you want, and reset it to the saved value before exiting the function.

old_seed <- .Random.seed
rnorm(1)
#[1] -1.173346

set.seed(42)
rbinom(1, size = 1, prob = 0.5)
#[1] 0

.Random.seed <- old_seed
rnorm(1)
#[1] -1.173346

In a function it could be something like the following, without the message instructions. Note that the function prints nothing, never calls any pseudo-RNG, and always returns TRUE. The point is to save the seed's current value and reset the seed in on.exit.

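f <- function(simul = FALSE){
  if(simul){
    message("simul is TRUE")
    # .Random.seed lives in the global environment (it exists once the RNG has been used)
    old_seed <- get(".Random.seed", envir = globalenv())
    # a plain `.Random.seed <- old_seed` here would only create a local variable,
    # so the saved state is written back to the global environment explicitly
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()))
    # rest of code
  } else message("simul is FALSE")
  invisible(TRUE)
}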

f()
s <- .Random.seed
f(TRUE)
identical(s, .Random.seed)
#[1] TRUE

rm(s)
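
The function above never draws a random number, so it may help to see the same save-and-restore idea with actual RNG calls inside. The following is only a minimal sketch: `draw_fixed` and its `seed` argument are hypothetical names, and it also covers the case where no RNG state exists yet. It shows the mechanics only; it does not by itself settle what CRAN reviewers will accept.

```r
# Hypothetical helper: draws n seeded normals without disturbing the caller's RNG stream.
draw_fixed <- function(n, seed) {
  # Snapshot the caller's RNG state; NULL if no random number has been drawn yet.
  old_seed <- get0(".Random.seed", envir = globalenv(), inherits = FALSE)
  on.exit({
    if (is.null(old_seed)) {
      # No state existed before the call: remove the one created by set.seed().
      rm(".Random.seed", envir = globalenv())
    } else {
      # Write the saved state back to the global environment.
      assign(".Random.seed", old_seed, envir = globalenv())
    }
  })
  set.seed(seed)
  rnorm(n)
}

runif(1)                          # make sure an RNG state exists
before <- .Random.seed
x1 <- draw_fixed(3, seed = 42)
x2 <- draw_fixed(3, seed = 42)
identical(x1, x2)                 # the seeded draws repeat
#[1] TRUE
identical(before, .Random.seed)   # the caller's stream is untouched
#[1] TRUE
```

Taking the seed as an argument, instead of hard-coding 42, also leaves the choice of the specific number to the caller.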

When you fix the seed, a user who runs this code with the same parameters will obtain the same results every time.

Supposing that this chunk of code sits inside a larger chunk related only to the simulation, just get rid of the set.seed() call and try something like this:

if(simul == TRUE){
    w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    beta <- w*beta-(1-w)*beta
    s <- round((1-sparsity)*p)                  
    toReplace <- sample(p, size = s)
    beta <- replace(beta, list = toReplace, values = 0)

    # Generate the random p-columned matrix of indicator series. 
    X <- matrix(data = rnorm ((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)
}
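
With set.seed() removed from the function, reproducibility becomes the caller's responsibility. A minimal sketch of that idea follows, assuming the beta-generating step is split into its own function; `make_beta` and the example values are hypothetical, not part of the original package.

```r
# Hypothetical helper: the function only draws random numbers and never touches the seed.
make_beta <- function(beta, sparsity) {
  p <- length(beta)
  w <- matrix(rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
  beta <- w * beta - (1 - w) * beta        # flip signs at random
  s <- round((1 - sparsity) * p)
  replace(beta, list = sample(p, size = s), values = 0)
}

# The caller decides whether, and with which value, to seed.
set.seed(42)
b1 <- make_beta(beta = rep(1, 10), sparsity = 0.3)
set.seed(42)
b2 <- make_beta(beta = rep(1, 10), sparsity = 0.3)
identical(b1, b2)
#[1] TRUE
```

In a simulation study, the fixed beta (and X) could then be generated once before the loop and passed into each iteration, while everything else keeps varying.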

A similar question has been asked on the Bio devel mailing list. The suggestion there was to use the functionality of withr::with_seed. Your code could then become:

library(withr)

if(simul == TRUE){
  w <- with_seed(42, matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1))
} else {
  w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
}

beta <- w*beta-(1-w)*beta
s <- round((1-sparsity)*p)
toReplace <- sample(p, size = s)
beta <- replace(beta, list = toReplace, values = 0)

# Generate the random p-columned matrix of indicator series.
X <- matrix(data = rnorm((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)

Of course, that raises the question of how withr got on CRAN, given that it appears to do the same thing you are being told not to do. The difference may be that your version can overwrite an existing seed, whereas that code checks whether a seed already exists.
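
Note that in the snippet above only w is drawn under with_seed, so the later sample() and rnorm() calls still vary between runs. If both beta and X should stay fixed, one option is to wrap the whole generating step. This is a rough sketch assuming the withr package is installed; `gen_design`, its `seed` argument, and the example values are hypothetical.

```r
# Hypothetical helper; the `seed` argument is an assumption, not part of the original code.
gen_design <- function(beta, n_rows, sparsity, mean_X = 0, sd_X = 1,
                       simul = FALSE, seed = 42) {
  p <- length(beta)
  draw <- function() {
    w <- matrix(rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    beta <- w * beta - (1 - w) * beta
    s <- round((1 - sparsity) * p)
    beta <- replace(beta, list = sample(p, size = s), values = 0)
    X <- matrix(rnorm(n_rows * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_rows)
    list(beta = beta, X = X)
  }
  # with_seed() evaluates draw() under the given seed, then restores the previous RNG state
  if (simul) withr::with_seed(seed, draw()) else draw()
}

if (requireNamespace("withr", quietly = TRUE)) {
  a <- gen_design(rep(1, 5), n_rows = 4, sparsity = 0.4, simul = TRUE)
  b <- gen_design(rep(1, 5), n_rows = 4, sparsity = 0.4, simul = TRUE)
  print(identical(a, b))
}
#[1] TRUE
```

Any code that runs after gen_design() returns draws from the caller's untouched RNG stream, so the rest of the simulation still changes at each iteration.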
