简体   繁体   English

从均匀分布的混合中绘制随机数

[英]Draw random numbers from the mixture of uniform distributions

Goal 目标

I am trying to build a function that draws a specific number of random numbers from an an "incomplete uniform distribution". 我正在尝试构建一个从“不完全均匀分布”中抽取特定数量的随机数的函数。

What do I call an incomplete uniform distribution? 我称之为不完整的均匀分布?

I call an incomplete uniform distribution a probability distribution where each value of X within a series of boundaries have an equal probability to be picked. 我将不完整的均匀分布称为概率分布,其中一系列边界内的每个X值具有相同的拾取概率。 In other words it is a uniform distribution with holes (where the probability is zero) as represented below 换句话说,它是具有孔的均匀分布(其中概率为零),如下所示

x = list(12:25, 34:54, 67:90, 93:115)
y = 1/sum(25-12, 54-34, 90-67, 115-93)
plot(y=rep(y, length(unlist(x))), x=unlist(x), type="n", ylab="Probability", xlab="X")
for (xi in x)
{
    points(xi,rep(y, length(xi)), type="l", lwd=4)
}

在此输入图像描述

Ugly Solution 难看的解决方案

Here is a slow and ugly solution 这是一个缓慢而丑陋的解决方案

IncompleteUnif = function(n,b)
{
    #################
    # "n" is the desired number of random numbers
    # "b" is a list describing the boundaries within which a random number can possibly be drawn.
    #################
    r = c() # Series of random numbers to return
    for (ni in n)
    {
        while (length(r) < n) # loop will continue until we have the "n" random numbers we need
        {
            ub = unlist(b)
            x = runif(1,min(ub), max(ub)) # one random number taken over the whole range
            for (bi in b) # The following loop test if the random number is withinn the ranges specified by "b"
            {
                if (min(bi) < x & max(bi) > x) # if found in one range then just add "x" to "r" and break
                {
                    r = append(r,x)
                    break
                }
            }
        }
    }
    return (r)
}

b = list(c(5,94),c(100,198),c(220,292), c(300,350))
set.seed(12)
IncompleteUnif(10,b)
[1]  28.929516 287.132444 330.204498  63.425103  16.693990  66.680826 226.374551  12.892821   7.872065 140.480533

Your incomplete uniform distribution can be represented as a mixture of four ordinary uniform distributions, with mixing weight for each segment proportional to its length (ie, so that the longer the segment, the more weight it has). 您的不完全均匀分布可以表示为四个普通均匀分布的混合,每个分段的混合权重与其长度成比例(即,使分段越长,其具有越多的重量)。

To sample from such a distribution, first choose a segment (taking the weights into account). 要从这样的分布中进行采样,首先选择一个分段(考虑权重)。 Then from the chosen segment, choose an element. 然后从选定的段中选择一个元素。

I believe this works, using the algorithm suggested by Robert Dodier: 我相信这是有效的,使用Robert Dodier建议的算法:

rmixunif = function(n, b) {
    subdists = sample(seq_along(b), size = n, replace = T, prob = sapply(b, diff))
    subdists_n = tabulate(subdists)
    draw = numeric(n)
    for (i in unique(subdists)) {
        draw[subdists == i] = runif(subdists_n[i], min = b[[i]][1], max = b[[i]][2])
    }
    return(draw)
}

rmixunif(10, b = list(c(5,94),c(100,198),c(220,292), c(300,350)))
#  [1]  64.85989  85.33292 235.39607 233.40133 240.28686  67.21626 237.60248  11.80377 151.65365 306.44473

I like Sam Dickson's histogram visual check, here's my version: 我喜欢Sam Dickson的直方图视觉检查,这是我的版本:

x <- rmixunif(10000,list(c(0,1),c(2.5,3),c(6,10)))
hist(x,breaks=20)

在此输入图像描述

It could be fancied up with some input-checking (and maybe an mapply like suggested in the comments), but I'll leave that to others. 它可以通过一些输入检查(可能是评论中建议的mapply )来表达,但我会将其留给其他人。

Thanks to alexis_iaz for the tabulate() suggestion! 感谢alexis_iaz的tabulate tabulate()建议!

Slightly convoluted version of the @Gregor's solution. @Gregor解决方案稍有复杂的版本。

mix_unif <- function(n, b){
  x <- c()
  ns <- rmultinom(1, n, sapply(b, diff))
  for (i in seq_along(ns)) {
    x <- c(x, runif(ns[i], b[[i]][1], b[[i]][2]))
  }
  x
}

microbenchmark(mix_unif(1e5, b),
               rmixunif(1e5, b),
               IncompleteUnif(1e5, b), 
               unit="relative")
Unit: relative
                     expr      min       lq     mean   median       uq      max neval
       mix_unif(1e+05, b) 1.000000 1.000000 1.000000 1.000000 1.000000  1.00000   100
       rmixunif(1e+05, b) 3.123515 3.235961 3.750369 3.496843 3.462529 15.73449   100
 IncompleteUnif(1e+05, b) 6.806916 7.247425 6.926282 7.188556 7.093928 18.20041   100

Another solution is to transform the output. 另一种解决方案是转换输出。 The idea is to sample from a random uniform distribution, then apply a conditional transformation so that the numbers fall only in the selected ranges: 我们的想法是从随机均匀分布中进行采样,然后应用条件转换,使数字仅落在所选范围内:

IncompleteUnif = function(n,b) {
  widths <- cumsum(sapply(b,diff))
  x <- runif(n,0,tail(widths,1))
  out <- x
  out[x<=widths[1]] <- x[x<=widths[1]] + b[[1]][1]
  for(i in 2:length(b)) {
    out[widths[i-1]<x & x<=widths[i]] <- x[widths[i-1]<x & x<=widths[i]] - widths[i-1] + b[[i]][1]
  }
  return(out)
}

x <- IncompleteUnif(10000,list(c(0,1),c(2.5,3),c(6,10)))
hist(x,breaks=20)

在此输入图像描述

I'm a few years late to the party, but seeing that there wasn't a solution without an explicit loop yet, here's one such implementation (following @RobertDodier's approach): 我参加聚会已经晚了几年,但是看到没有明确循环的解决方案,这里有一个这样的实现(遵循@ RobertDodier的方法):

rmunif <- function(n, b) {
  runifb <- function(n, b) runif(n, b[1], b[2])
  ns <- rmultinom(1, n, vapply(b, diff, 1))
  unlist(Map(runifb, ns, b), use.names = FALSE)
}

hist(rmunif(1e5, list(0:1, c(5, 8), 9:10)))

library(microbenchmark)

set.seed(2018)
n <- 1e5

microbenchmark(
  rmunif(n, b),
  mix_unif(n, b),
  rmixunif(n, b),
  IncompleteUnif(n, b),
  unit = "relative"
) -> mb

print(mb, signif = 5)
#> Unit: relative
#>                  expr    min     lq   mean median     uq    max neval
#>          rmunif(n, b) 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000   100
#>        mix_unif(n, b) 1.1181 1.1256 1.1281 1.1728 1.1236 1.0476   100
#>        rmixunif(n, b) 2.7822 2.8982 2.9899 2.7850 2.8345 1.3970   100
#>  IncompleteUnif(n, b) 4.4922 4.7089 5.2732 4.5764 8.4317 2.4364   100

Created on 2018-03-11 by the reprex package (v0.2.0). reprex包 (v0.2.0)于2018-03-11创建。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM