
How to generate random numbers with conditions imposed in R?

I would like to generate 500 different combinations of a, b, and c that meet the following conditions:

  1. a + b + c = 1, and
  2. a < b < c

Here is a basic example of generating random numbers. However, I need to generate them subject to the conditions above.

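Coeff <- data.frame(a = runif(500, min = 0, max = 1),
                    b = runif(500, min = 0, max = 1),
                    c = runif(500, min = 0, max = 1))

One approach is to draw three uniform numbers per row, divide each row by its sum so that it adds up to 1, and then sort each row so that a < b < c:

myrandom <- function(n) {
  m <- matrix(runif(3*n), ncol=3)   # n rows of three uniform draws
  m <- cbind(m, rowSums(m))         # rowSums is efficient; column 4 holds the row sum
  # normalize each row by its sum, then sort ascending
  t(apply(m, 1, function(a) sort(a[1:3] / a[4])))
}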

Demonstration:

set.seed(2)
(m <- myrandom(5))
#           [,1]      [,2]      [,3]
# [1,] 0.1099815 0.3287708 0.5612477
# [2,] 0.1206611 0.2231769 0.6561620
# [3,] 0.2645362 0.3509054 0.3845583
# [4,] 0.2057215 0.2213517 0.5729268
# [5,] 0.2134069 0.2896015 0.4969916
all(abs(rowSums(m) - 1) < 1e-8) # CONSTRAINT 1: a+b+c = 1
# [1] TRUE
all(apply(m, 1, diff) > 0)      # CONSTRAINT 2: a < b < c
# [1] TRUE
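
To get back to the shape asked for in the question (a 500-row data frame with columns a, b and c), the matrix can be wrapped as shown below. This is only a minimal sketch: it repeats myrandom() from above so that it runs on its own, the name Coeff is taken from the question, and the anyDuplicated() check is my own addition to cover the "different combinations" requirement.

```r
myrandom <- function(n) {
  m <- matrix(runif(3 * n), ncol = 3)
  m <- cbind(m, rowSums(m))
  t(apply(m, 1, function(a) sort(a[1:3] / a[4])))
}

set.seed(2)
Coeff <- as.data.frame(myrandom(500))
names(Coeff) <- c("a", "b", "c")

nrow(Coeff)                                  # number of combinations
all(abs(rowSums(Coeff) - 1) < 1e-8)          # constraint 1: a + b + c = 1
all(Coeff$a < Coeff$b & Coeff$b < Coeff$c)   # constraint 2: a < b < c
anyDuplicated(Coeff) == 0                    # TRUE when no row is repeated
```

With continuous draws, repeated rows are extremely unlikely, so the last check is mostly a safeguard.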

Note:

  • My test for "sum to 1" uses a tolerance rather than == 1 because of IEEE-754 floating-point arithmetic (see R FAQ 7.31): any floating-point comparison should be an inequality against a tolerance rather than a test for exact equality. If you test for == 1, you will eventually find rows where it does not appear to be satisfied:

     set.seed(2)
     m <- myrandom(1e5)
     head(which(rowSums(m) != 1))
     # [1]  73 109 199 266 367 488
     m[73,]
     # [1] 0.05290744 0.24824770 0.69884486
     sum(m[73,])
     # [1] 1
     sum(m[73,]) == 1
     # [1] FALSE
     abs(sum(m[73,]) - 1) < 1e-15
     # [1] TRUE
     max(abs(rowSums(m) - 1))
     # [1] 1.110223e-16
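
Base R also provides all.equal(), which compares numbers up to a tolerance and can stand in for a hand-written abs(...) < eps test. The sketch below uses the classic 0.1 + 0.2 example instead of the random matrix; rows_sum_to_one is a hypothetical helper of mine, not something from the answer above, and its default tolerance is an arbitrary choice.

```r
x <- 0.1 + 0.2
x == 0.3                     # exact comparison
# [1] FALSE
isTRUE(all.equal(x, 0.3))    # tolerance-based comparison
# [1] TRUE

# Hypothetical helper: TRUE when every row sums to 1 within `tol`
rows_sum_to_one <- function(m, tol = 1e-8) {
  all(abs(rowSums(m) - 1) < tol)
}

m <- matrix(c(0.2, 0.3, 0.5,
              0.1, 0.1, 0.8), ncol = 3, byrow = TRUE)
rows_sum_to_one(m)
```

Note that all.equal() returns a character description rather than FALSE when the values differ, which is why it is wrapped in isTRUE().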

I would like to point out that ANY distribution (uniform, Gaussian, exponential, ...) will produce numbers a, b and c that meet your conditions once you normalize and sort them, so you need some domain knowledge to prefer one over another.
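
As a rough illustration of that point, the sketch below applies the same normalize-and-sort step to three different samplers. The helper normalize_sorted is a hypothetical name of mine, and I assume the sampler returns positive numbers (hence abs() around rnorm()) so that the results stay between 0 and 1.

```r
# Hypothetical helper: `draw` is any function of n that returns n positive numbers
normalize_sorted <- function(n, draw) {
  m <- matrix(draw(3 * n), ncol = 3)
  m <- m / rowSums(m)     # each row now sums to 1
  t(apply(m, 1, sort))    # each row is now in ascending order
}

set.seed(1)
u <- normalize_sorted(1000, runif)                       # uniform draws
e <- normalize_sorted(1000, rexp)                        # exponential draws
g <- normalize_sorted(1000, function(k) abs(rnorm(k)))   # half-normal draws

# Both constraints hold in every case, but the typical values differ
sapply(list(unif = u, exp = e, halfnorm = g), colMeans)
```

Comparing the column means (or histograms of each column) is one way to see how the choice of distribution shapes the result.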

As an alternative, I would propose using the Dirichlet distribution, which produces numbers that naturally satisfy your first condition, a + b + c = 1. I believe it has been applied to rainfall modelling as well ( https://arxiv.org/pdf/1801.02962.pdf ).

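library(MCMCpack)
n <- 500                            # number of combinations to generate
abc <- rdirichlet(n, c(1, 1, 1))    # n x 3 matrix, each row sums to 1
sum(abc) # should equal n, up to floating-point error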

You could vary the exponent (concentration) parameters, here c(1,1,1), to shape the data, and, of course, sort each row to satisfy your second condition. In many cases it is easier to reason about your model's behavior if it uses a Dirichlet distribution (for example, the Dirichlet is the conjugate prior for the multinomial in the Bayesian approach).
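
If installing MCMCpack is not an option, a Dirichlet draw can be sketched in base R with the standard construction of normalizing independent gamma variates. This is only an illustration: rdirichlet_base is a hypothetical helper name, not a function from any package, and the alpha values are arbitrary.

```r
# Hypothetical base-R stand-in for a Dirichlet sampler
rdirichlet_base <- function(n, alpha) {
  k <- length(alpha)
  # column j holds n draws from Gamma(shape = alpha[j], rate = 1)
  g <- matrix(rgamma(n * k, shape = rep(alpha, each = n)), ncol = k)
  g / rowSums(g)   # normalizing each row gives a Dirichlet(alpha) draw
}

set.seed(42)
flat   <- rdirichlet_base(500, c(1, 1, 1))      # uniform over the simplex
peaked <- rdirichlet_base(500, c(10, 10, 10))   # concentrated near (1/3, 1/3, 1/3)

# Sort each row so that a < b < c
abc <- t(apply(flat, 1, sort))
colnames(abc) <- c("a", "b", "c")

all(abs(rowSums(abc) - 1) < 1e-8)   # condition 1
all(apply(abc, 1, diff) > 0)        # condition 2

# Larger alpha pulls rows toward the centre, so the largest component shrinks
c(flat = mean(apply(flat, 1, max)), peaked = mean(apply(peaked, 1, max)))
```

Sorting happens after sampling, so the sorted rows no longer follow a plain Dirichlet distribution; they follow its order statistics.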

