
How to generate multivariate normal data in R?

I'm completing an assignment in which I have to generate a sample X = (X1, X2) from a bivariate normal distribution in which each marginal is N(0,1) and the correlation between X1 and X2 is 0.5.

I think the way to approach this is to use the mvrnorm function, but I'm not quite sure how to proceed after that. Any advice? Thanks in advance!

Indeed, the mvrnorm function from the MASS package is probably your best bet. This function generates pseudo-random data from multivariate normal distributions.

Examining the help page for this function (??mvrnorm) shows that there are three key arguments that you need in order to simulate data based on your given parameters, i.e.:

  • n - the number of samples required (an integer);
  • mu - a vector giving the means of the variables - here, your distributions are standard normal, so it will be a vector of zeros; and
  • Sigma - a positive-definite symmetric matrix specifying the covariance matrix of the variables - i.e., in your case, a matrix with ones (the variances) on the diagonal and 0.5 (the covariance) on the off-diagonals. Because both variances are 1, the covariance equals the required correlation of 0.5.

Have a look at the examples in this help page, which should help you put these ideas together!
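
As a rough illustration of how the three arguments described above fit together, here is a minimal sketch. The sample size of 1000 and the seed are arbitrary choices made for the example, not part of the question.

```r
# Sketch: bivariate normal with N(0,1) marginals and correlation 0.5.
library(MASS)  # recommended package, shipped with standard R installations

set.seed(42)               # arbitrary seed, only so the run is reproducible
n     <- 1000              # illustrative sample size (an assumption)
mu    <- c(0, 0)           # both means are zero
Sigma <- matrix(c(1,   0.5,
                  0.5, 1), nrow = 2, byrow = TRUE)

X <- mvrnorm(n = n, mu = mu, Sigma = Sigma)  # n x 2 matrix, one row per draw

dim(X)   # rows are draws, columns are X1 and X2
cor(X)   # off-diagonal entries should be close to 0.5
```

The sample correlation will not be exactly 0.5; it only approaches that value as n grows.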

Here are some options:

  1. mvtnorm::rmvnorm and MASS::mvrnorm work the same way, although mvtnorm::rmvnorm does not require you to specify the means (i.e., the default is 0). Giving names to the mu vector sets the names of the simulated variables.
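n <- 100
# Covariance matrix; with unit variances it is also the correlation matrix
R <- matrix(c(1, 0.5,
              0.5, 1), 
            nrow = 2, ncol = 2, byrow = TRUE)

# Named means: the names become the column names of the result
mu <- c(X = 0, Y = 0)
mvtnorm::rmvnorm(n, mean = mu, sigma = R)
MASS::mvrnorm(n, mu = mu, Sigma = R)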
  2. simstandard::sim_standardized will make standardized data only, but will do so with less typing:
simstandard::sim_standardized("X ~~ 0.5 * Y", n = 100)
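
Whichever option you pick, it is worth checking the result against the target parameters. The following is a small sketch using MASS::mvrnorm only; the seed and the sample size of 10000 are assumptions chosen so that the sample statistics sit close to their targets.

```r
# Sketch: sanity-check simulated data against the requested parameters.
set.seed(1)                       # arbitrary seed for reproducibility
R  <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
mu <- c(X = 0, Y = 0)             # named means give named columns

d <- MASS::mvrnorm(10000, mu = mu, Sigma = R)

colnames(d)            # column names are taken from names(mu)
round(colMeans(d), 2)  # both should be near 0
round(cov(d), 2)       # diagonal near 1, off-diagonal near 0.5
```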

Using base R (no package needed) and a bit of statistics:

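Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
R <- chol(Sigma)  # upper triangular factor: Sigma == t(R) %*% R
n <- 1000
# 2 x n matrix of independent N(0,1) draws, mixed by t(R) so that Cov(X) = Sigma
X <- t(R) %*% matrix(rnorm(n * 2), 2)

# X is 2 x n (one column per draw); this should be close to Sigma
X %*% t(X) / n  # test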
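
For the bivariate case, the Cholesky approach above can be written out by hand, which may make it clearer why it works. This is a sketch under the assumption of unit variances; the names rho, z1, z2, x1 and x2 are introduced here for illustration only.

```r
# Sketch: the 2 x 2 Cholesky construction written out explicitly.
set.seed(123)   # arbitrary seed for reproducibility
n   <- 10000    # illustrative sample size
rho <- 0.5      # target correlation

# With unit variances, t(chol(Sigma)) is the lower triangular matrix
#   [ 1     0               ]
#   [ rho   sqrt(1 - rho^2) ]
Sigma <- matrix(c(1, rho, rho, 1), ncol = 2)
L <- t(chol(Sigma))

z1 <- rnorm(n)  # two independent standard normal samples
z2 <- rnorm(n)

x1 <- z1                                # Var(x1) = 1
x2 <- rho * z1 + sqrt(1 - rho^2) * z2   # Var(x2) = rho^2 + (1 - rho^2) = 1

X <- cbind(X1 = x1, X2 = x2)  # n x 2 layout, one row per draw
cor(x1, x2)                   # should be close to rho
```

Since Cov(x1, x2) = rho * Var(z1) = rho, each marginal stays N(0,1) while the pair has the required correlation.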
