How to submit a "vector of distributions" to a function in R?
I want to write an R function, say f, which has inputs x and n, where x is some kind of "list of distributions" and f is supposed to draw n samples from each distribution in x.
What is a good way to implement this in R?
My current idea is:
f = function(x, n) {
  out = list()
  for (i in 1:length(x)) {
    name = sub("\\(.*", "", x[i])
    size = ifelse(name == "sample", paste("size=", n), paste0("n=", n))
    args = paste(size, gsub("[\\(\\)]", "", regmatches(x[i], gregexpr("\\(.*?\\)", x[i]))[[1]]), sep = ",")
    out[[i]] = eval(parse(text = paste0(name, "(", args, ")")))
  }
  return(out)
}
f(x = c("rnorm(mean=1,sd=2)", "sample(0:1,replace=TRUE)", "rbinom(size=10,prob=0.1)"), n = 10)
I don't like this implementation, because n is not always the input name for the sample size (e.g. in sample it is size).
Can I improve the implementation, for example with x of class alist?
You could change your input and create a list of function names and arguments. For each distribution we set the n / size value to 1.
ls_func <- list("rnorm"  = list(mean = 0, sd = 1, n = 1),
                "sample" = list(x = 0:1, replace = TRUE, size = 1),
                "rbinom" = list(size = 10, prob = 0.1, n = 1))
Your function takes those distributions and replicates each single draw n times:
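g <- function(ls_func, n) {
  out <- list()
  for (i in seq_along(ls_func)) {
    # names(ls_func)[i] is the function name, ls_func[[i]] its argument list;
    # each do.call() draws one value, and replicate() repeats that n times
    out[[i]] <- replicate(n, do.call(names(ls_func)[i], ls_func[[i]]))
  }
  return(out)
}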
so
set.seed(4096)
g(ls_func, 10)
returns
[[1]]
[1] 0.1894398 -0.1622468 0.5327100 -1.5747229 -0.6884024 -0.3092226 -0.0879258 -0.4320240 -0.7799596 0.4525895
[[2]]
[1] 0 1 0 0 0 1 1 1 0 0
[[3]]
[1] 0 0 1 1 0 1 1 1 1 0
Basically, it's not a good approach to use eval(parse(text=...)) to execute functions. Use do.call instead.
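To make the do.call suggestion concrete, here is a minimal sketch in base R. The variable names fn_name and fn_args are made up for this illustration; the point is only that a function name plus a list of arguments is enough to build the call, with no string parsing involved.

```r
# Hypothetical names for illustration: fn_name, fn_args
fn_name <- "rnorm"
fn_args <- list(n = 5, mean = 1, sd = 2)

# do.call() looks up the function by name and passes the list
# elements as (named) arguments
set.seed(1)
draws_call <- do.call(fn_name, fn_args)

# The same call written out directly, for comparison
set.seed(1)
draws_direct <- rnorm(n = 5, mean = 1, sd = 2)

# Both calls consume the random number stream in the same way
identical(draws_call, draws_direct)
```

Because the arguments stay ordinary R objects (numbers, vectors, logicals) rather than text, there is nothing to escape or re-parse, which is the main practical advantage over pasting strings together.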
You can remove the for-loop:
g <- function(ls_func, n) {
  lapply(seq_along(ls_func), function(i) {
    replicate(n, do.call(names(ls_func)[i], ls_func[[i]]))
  })
}
Note: This code also crashes if your distributions aren't defined properly. To avoid this, you need some error handling. Look into the try and stop functions.
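As one possible way to add that error handling, the sketch below wraps a single distribution in a helper. safe_draw is a hypothetical name, not part of the answer above, and it uses tryCatch (a close relative of try) so that a failing distribution yields NULL with a warning instead of aborting the whole run. Whether to return NULL or to stop is a design choice that depends on your use case.

```r
# safe_draw() is a hypothetical helper, shown only as a sketch
safe_draw <- function(fn_name, fn_args, n) {
  # Fail early with a clear message if the name is not a known function
  if (!exists(fn_name, mode = "function")) {
    stop("unknown function: ", fn_name)
  }
  # Catch errors raised while sampling; warn and return NULL instead
  tryCatch(
    replicate(n, do.call(fn_name, fn_args)),
    error = function(e) {
      warning("sampling with ", fn_name, " failed: ", conditionMessage(e))
      NULL
    }
  )
}

# A well-defined distribution returns n draws
ok <- safe_draw("rnorm", list(n = 1, mean = 0, sd = 1), n = 10)

# A misspecified argument list gives NULL (warning suppressed here)
bad <- suppressWarnings(safe_draw("rnorm", list(n = 1, nonsense = 1), n = 10))
```

Inside g, the call to replicate(...) could be swapped for safe_draw(names(ls_func)[i], ls_func[[i]], n), so that one badly specified entry does not discard the samples from the others.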
I've been putting together an R package, distionary, that can help with this.
First make a list of input distributions:
library(distionary)
x <- list(
  dst_norm(1, 2^2),
  dst_empirical(0:1),
  dst_binom(10, 0.1)
)
The function for drawing from a distribution is realize(), which fits nicely with lapply() (or purrr's map()):
set.seed(123)
lapply(x, realize, n = 10)
#> [[1]]
#> [1] -0.1209513 0.5396450 4.1174166 1.1410168 1.2585755 4.4301300
#> [7] 1.9218324 -1.5301225 -0.3737057 0.1086761
#>
#> [[2]]
#> [1] 0 0 0 0 0 0 0 0 1 1
#>
#> [[3]]
#> [1] 3 2 1 2 0 1 2 0 0 0
Putting this code in a function is then straightforward:
f <- function(x, n) {
  lapply(x, realize, n = n)
}
set.seed(123)
f(x, n = 10)
#> [[1]]
#> [1] -0.1209513 0.5396450 4.1174166 1.1410168 1.2585755 4.4301300
#> [7] 1.9218324 -1.5301225 -0.3737057 0.1086761
#>
#> [[2]]
#> [1] 0 0 0 0 0 0 0 0 1 1
#>
#> [[3]]
#> [1] 3 2 1 2 0 1 2 0 0 0