
How to induce a correlation between two inverse cumulative probability distributions in R?

I'd like to draw correlated samples through inverse cumulative distribution functions. For example, I currently generate the two independent samples shown below from inverse lognormal CDFs, and I would like to induce a correlation of, say, -0.5 between them. Is there a way to achieve this?


library(lognorm)
library(dplyr)

# tbl_df() is deprecated in dplyr; tibble() gives the same one-column result
Var_a <- tibble(value = qlnorm(runif(1000), meanlog = 0.0326, sdlog = 0.0288))
var_b <- tibble(value = qlnorm(runif(1000), meanlog = 0.0452, sdlog = 0.0364))

cor(Var_a, var_b)

Would the following work for you?

set.seed(100)
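x1 <- rnorm(1000)
y1 <- rnorm(1000) - .6 * x1  # cor(x1, y1) is -0.6 / sqrt(1.36), about -0.51

# pnorm() maps a standard normal to Uniform(0, 1); y1 has variance 1.36,
# so y2 is only approximately uniform
x2 <- pnorm(x1)
y2 <- pnorm(y1)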

cor(cbind(x2, y2))
#            x2         y2
# x2  1.0000000 -0.4995593
# y2 -0.4995593  1.0000000

# tbl_df() is deprecated in dplyr; tibble() gives the same one-column result
Var_a <- tibble(value = qlnorm(x2, meanlog = 0.0326, sdlog = 0.0288))
var_b <- tibble(value = qlnorm(y2, meanlog = 0.0452, sdlog = 0.0364))

cor(Var_a, var_b)
#            value
# value -0.5239145
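
In the snippet above, y1 has variance 1.36 rather than 1, so pnorm(y1) is not exactly uniform and the second marginal is slightly off. A minimal sketch of a variant that keeps both marginals exact, assuming the target correlation rho is specified on the normal scale (the helper name correlated_lnorm is hypothetical, not from any package):

```r
# Hypothetical helper: two lognormal samples linked by a Gaussian copula.
correlated_lnorm <- function(n, rho, meanlog, sdlog) {
  x <- rnorm(n)
  # rho * x + sqrt(1 - rho^2) * z is standard normal with cor(x, y) = rho
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)
  cbind(
    a = qlnorm(pnorm(x), meanlog = meanlog[1], sdlog = sdlog[1]),
    b = qlnorm(pnorm(y), meanlog = meanlog[2], sdlog = sdlog[2])
  )
}

set.seed(100)
sim <- correlated_lnorm(10000, rho = -0.5,
                        meanlog = c(0.0326, 0.0452),
                        sdlog   = c(0.0288, 0.0364))
cor(sim)                       # Pearson correlation, should be close to -0.5
cor(sim, method = "spearman")  # rank correlation, for comparison
```

Because the sdlog values here are small, the lognormal transform is nearly linear and the Pearson correlation of the outputs stays close to rho. With more skewed marginals the two can drift apart, so it is worth checking cor() on the result.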

Update: I'm still not sure what you are trying to do, but if you just want to apply the approach above to 15 variables, something like this might work:

library(MASS)
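# your correlation matrix: 1 on the diagonal, 0.5 everywhere else
sigma <- matrix(.5, nrow = 15, ncol = 15) + diag(15)*.5
sigma
# correlated standard normals
vars <- mvrnorm(1000, mu = rep(0, 15), Sigma = sigma)
vars
cor(vars)
# map each column to Uniform(0, 1)
vars2 <- pnorm(vars)
cor(vars2)
# use each column of vars2 as the probability argument to qlnorm()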

vars2 <- data.frame(vars2)
names(vars2)
vars2

# paste0() avoids the space that paste() would put into the names ("log_ 1")
vars2[paste0("log_", 1:15)] <- lapply(vars2[, 1:15], function(x) {qlnorm(x, meanlog = 0.0326, sdlog = 0.0288)})
names(vars2)
vars2 <- vars2[, -c(1:15)]
cor(vars2)

If you have 15 variables with a correlation matrix CC, you could use a Gaussian copula to get correlated uniform variates: multiply independent standard normals by the Cholesky factor of CC, apply pnorm(), and then invert the result with your specified marginals as you did above.

nv <- NROW(CC)
num_samples <- 1000
A <- matrix(rnorm(num_samples * nv), ncol = nv)
U <- pnorm(A %*% chol(CC))

If the lognormal parameters of your 15 variables are stored in vectors means and stdevs (passed as meanlog and sdlog, so both are on the log scale), you could do:

rv <- sapply(1:nv, function(i) qlnorm(U[,i], meanlog = means[i], sdlog = stdevs[i]))

The columns of rv are your simulated variates, with a correlation structure close to the desired one; you can check this with cor(rv).
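
To make the copula steps above concrete, here is a small end-to-end sketch with three variables instead of 15. The values of CC, means and stdevs below are made up purely for illustration; substitute your own.

```r
set.seed(1)

# Assumed inputs, for illustration only
CC <- matrix(c( 1.0, -0.5,  0.2,
               -0.5,  1.0,  0.1,
                0.2,  0.1,  1.0), nrow = 3, byrow = TRUE)
means  <- c(0.0326, 0.0452, 0.0400)  # meanlog of each marginal
stdevs <- c(0.0288, 0.0364, 0.0300)  # sdlog of each marginal

nv <- NROW(CC)
num_samples <- 10000

# Independent standard normals, one column per variable
A <- matrix(rnorm(num_samples * nv), ncol = nv)

# chol(CC) is upper triangular with t(R) %*% R == CC, so A %*% R has
# correlation CC; chol() stops with an error if CC is not positive definite
U <- pnorm(A %*% chol(CC))

# Invert each uniform column with its own lognormal marginal
rv <- sapply(seq_len(nv), function(i) {
  qlnorm(U[, i], meanlog = means[i], sdlog = stdevs[i])
})

round(cor(rv), 2)       # compare against CC
max(abs(cor(rv) - CC))  # largest deviation from the target
```

Note that CC is imposed on the underlying normals. The Pearson correlation of the transformed variates is only approximately CC, and the gap grows as the marginals become more skewed.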
