r - Sampling from a grid of probabilities (Bayesian posterior approximation)
I am doing a Bayesian analysis and trying to estimate two parameters. To approximate the posterior distribution, I constructed a fine grid and computed the posterior probability for each element in the grid. I normalized it so that the grid sums to 1.
Now I am interested in sampling from the distribution. This is what I have so far:
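sampleFromGrid <- function(post.grid, mu.grid, sig2.grid) {
  # Draw one cell value with probability proportional to its posterior mass
  value <- sample(post.grid, 1, prob=post.grid)
  # Linear (column-major) index of that cell; assumes no two cells share the same value
  index <- which(post.grid == value)
  # Subtract 1 before dividing so that cells in the last row map to the right column
  col <- as.integer((index-1)/nrow(post.grid))+1
  row <- index-(col-1)*nrow(post.grid)
  return(c(mu.grid[row], sig2.grid[col]))
}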
However, I run into runtime problems when I want to draw many samples, because I use a for loop:
for(i in 1:nrow(sample.grid)) {
  sample.grid[i, ] <- sampleFromGrid(post.grid, mu.grid, sig2.grid)
}
I was wondering whether there is a way to vectorize this. My attempt was:
vectorizedSampleFromGrid <- function(post.grid, mu.grid, sig2.grid, n){
  values <- sample(post.grid, n, replace=T, prob=post.grid)
  index <- which(post.grid %in% values)
  if(length(values)!=length(index)) {
    temp.df <- count(values)
    index <- which(post.grid %in% temp.df[,1])
    temp.df <- cbind(temp.df, index)
    temp.df <- temp.df[temp.df[, 2] > 1, ]
    for(i in 1:nrow(temp.df)) {
      index <- c(index, rep(temp.df[i, 3], temp.df[i,2]-1))
    }
  }
  col <- as.integer(index/nrow(post.grid))+1
  row <- index-(col-1)*nrow(post.grid)
  return(cbind(mu.grid[row], sig2.grid[col]))
}
I know that some elements will be sampled more than once. What I am trying to do is append those indices to the original index list as many times as they were sampled. However, when I do this, the result is not correct.
If anyone can offer any advice, I would greatly appreciate it.
Here is what I would do. Create a vectorized function to evaluate the posterior (or at least something proportional to it):
f = function(mu, sigma, log=TRUE) {
  # Log of the unnormalized posterior: normal density for mu times gamma density for sigma
  logf = dnorm(mu, 0, sigma, log=TRUE) + dgamma(sigma, 1, 1, log=TRUE)
  if (log) return(logf)
  return(exp(logf))
}
Now evaluate this function on a grid:
library(dplyr)
grid = mutate(expand.grid(mu=seq(-3,3,1), sigma=seq(1,7,1)),
              logp = f(mu,sigma),
              logp = logp-max(logp), # for numerical stability
              p = exp(logp),
              p = p/sum(p)) # Normalize
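The `logp-max(logp)` step above matters once the log-posterior values become very negative, which is common with real data. The following is a minimal sketch with made-up log values (not taken from the model above) showing how direct exponentiation underflows to zero, while shifting by the maximum first keeps the normalized probabilities well defined:

```r
# Made-up log-posterior values, chosen so that exp() underflows to 0
logp <- c(-1000, -1001, -1002)

# Naive normalization: every exp() is 0, so the result is 0/0 = NaN
naive <- exp(logp) / sum(exp(logp))

# Stable normalization: shift so the largest log value is 0 before exponentiating
shifted <- exp(logp - max(logp))
stable <- shifted / sum(shifted)

print(naive)
print(stable)
```

The shift cancels out in the ratio, so the stable version gives the same probabilities the naive one would give in exact arithmetic.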
Now obtain samples from this grid:
samples = sample_n(grid, size=100, replace=TRUE, weight=grid$p)
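If the posterior is already stored as a matrix, as in the question, the same idea should work in base R by sampling cell indices instead of cell values. The sketch below assumes rows of `post.grid` correspond to `mu.grid` and columns to `sig2.grid`. The helper name `sampleFromGridVectorized` and the toy posterior used to fill the matrix are illustrative assumptions, not part of the original code:

```r
# Hypothetical helper: draw n (mu, sig2) pairs from a normalized posterior matrix.
# Assumes rows index mu.grid and columns index sig2.grid.
sampleFromGridVectorized <- function(post.grid, mu.grid, sig2.grid, n) {
  # Sample linear cell indices directly; replace = TRUE handles repeated cells
  idx <- sample(length(post.grid), size = n, replace = TRUE,
                prob = as.vector(post.grid))
  # Convert column-major linear indices to (row, column) pairs
  rc <- arrayInd(idx, dim(post.grid))
  cbind(mu = mu.grid[rc[, 1]], sig2 = sig2.grid[rc[, 2]])
}

# Toy grid for illustration only (the density used here is an assumption)
set.seed(1)
mu.grid <- seq(-3, 3, by = 1)
sig2.grid <- seq(1, 7, by = 1)
post.grid <- outer(mu.grid, sig2.grid,
                   function(m, s2) dnorm(m, 0, sqrt(s2)) * dgamma(s2, 1, 1))
post.grid <- post.grid / sum(post.grid)

draws <- sampleFromGridVectorized(post.grid, mu.grid, sig2.grid, 1000)
head(draws)
```

Sampling indices rather than values avoids the lookup with `which()`, so cells that share the same probability or are drawn more than once need no special handling.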