Why do my numbers not match, multivariate t distribution in R mvtnorm

I was trying to program the algorithm for the cdf of the multivariate t-distribution following Genz and Bretz. The reference package in R is mvtnorm.

When I tested my function, I found that my numbers don't match. In the following example, adapted from the mvtnorm help, the multivariate t random variable has independent components, so the integral should be just the product of 3 independent probabilities:

> lower <- -1
> upper <- 3
> df <- 4
> corr <- diag(3)
> delta <- rep(0, 3)
> pmvt(lower=lower, upper=upper, delta=delta, df=df, corr=corr)
[1] 0.5300413
attr(,"error")
[1] 4.321136e-05
attr(,"msg")
[1] "Normal Completion"

The reported error is about 4e-5, but the discrepancy compared to the product of independent probabilities

> (pt(upper, df) - pt(lower, df))**3
[1] 0.4988254

is

0.5300413 - 0.4988254 = 0.0312159

I'm getting discrepancies of about the same size between my own code and R mvtnorm for various examples.

I'm mostly a beginner in R. So, what am I doing wrong, or what is wrong?

(I'm not signed up on an R-help mailing list, so I'm trying here.)

UPDATE: As pchalasani explained, my statistics were wrong; the bug in my own code was in a helper function, not in the t distribution code. A good way of seeing that being uncorrelated does not imply independence is to look at the conditional distribution. Here are the column frequencies (in percent) for a bivariate random variable with uncorrelated components (10000 samples), binned by quartiles (distribution conditional on the column variable).

bivariate uncorrelated normal variates

([[26, 25, 24, 23],
  [24, 23, 24, 25],
  [24, 27, 24, 24],
  [24, 23, 26, 25]])

bivariate uncorrelated t variates

([[29, 20, 22, 29],
  [20, 31, 28, 21],
  [20, 29, 29, 20],
  [29, 18, 18, 29]])

The distribution in the first and last columns is very different from the middle columns. (Sorry, no R code, since I don't know how to do this quickly with R.)
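A quartile table like the two above can be produced in base R along the following lines. This is only a sketch and not the code that generated the numbers above: the names `quartile_bin` and `column_pct` are made up for illustration, the t sample is built from the usual construction (normal draws divided by a shared chi-square scale) rather than taken from mvtnorm, and the sample size is larger than 10000 to reduce noise.

```r
# Sketch only: function and variable names are illustrative, not from the post.
set.seed(1)
n  <- 100000   # assumption: larger than the 10000 used above, to reduce sampling noise
df <- 4

# Uncorrelated bivariate normal draws
z <- matrix(rnorm(2 * n), ncol = 2)

# Uncorrelated bivariate t draws: both columns share one chi-square scale per row
w   <- rchisq(n, df)
x_t <- z / sqrt(w / df)

# Assign each value to its sample quartile (1..4)
quartile_bin <- function(v) {
  cut(v, breaks = quantile(v, probs = (0:4) / 4),
      include.lowest = TRUE, labels = FALSE)
}

# Column percentages: distribution of the row variable given the column quartile
column_pct <- function(m) {
  tab <- table(quartile_bin(m[, 1]), quartile_bin(m[, 2]))
  round(100 * prop.table(tab, margin = 2))
}

pct_normal <- column_pct(z)
pct_t      <- column_pct(x_t)
print(pct_normal)
print(pct_t)
```

For the normal sample every cell should sit near 25, while for the t sample the corner cells should be visibly larger than their neighbours in the same column.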

Zero correlation does not imply independence for jointly non-Gaussian distributed random variables!

Let me elaborate: there is no bug here. The flaw lies in your assumption that when multivariate Student-t random variables are uncorrelated, they are also independent, which is definitely not the case. Among elliptical multivariate distributions (the family that includes the multivariate t), the only one where zero correlation implies independence is the MV Gaussian distribution.
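One way to make this concrete for the numbers in the question is the standard representation of a central multivariate t vector as independent standard normals divided by a common factor sqrt(W/df), with W chi-square distributed with df degrees of freedom. Conditional on W the components are independent normals, so the rectangle probability is an average of cubes over W, not the cube of an average. The sketch below uses only base R; the names `lo`, `hi`, `k`, `p_mixture` and `p_product` are made up for illustration, and it is a cross-check under that assumption, not how mvtnorm computes the value.

```r
# Hypothetical cross-check, not mvtnorm's algorithm: condition on the shared
# chi-square variable w, under which the k components are independent normals.
lo <- -1
hi <- 3
df <- 4
k  <- 3

integrand <- function(w) {
  s <- sqrt(w / df)                          # common scale factor
  (pnorm(hi * s) - pnorm(lo * s))^k * dchisq(w, df)
}

# Average the conditional rectangle probability over the chi-square density
p_mixture <- integrate(integrand, lower = 0, upper = Inf)$value

# Naive product of marginal t probabilities, as in the question
p_product <- (pt(hi, df) - pt(lo, df))^k

print(p_mixture)   # should be close to the pmvt value 0.5300413 quoted in the question
print(p_product)   # the product of marginals, 0.4988254 in the question
```

The gap between the two numbers comes entirely from the shared scale factor: large or small values of W move all three components together.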

To see that two uncorrelated random variables that jointly follow a MV Student-t distribution are not independent, consider the case of n=2:

require(mvtnorm)
x <- rmvt(100000, sigma = diag(2), df=4, delta = rep(0,2) )

Now each column of x holds realizations of one of the two random variables. We first check that their correlation is fairly small:

> cor(x[,1], x[,2])
[1] -0.003378811

However, the correlation of the squares of x[,1] and x[,2] is as high as 30.4% in this sample, i.e., definitely not zero, which shows that x[,1] and x[,2] are not statistically independent. (With df=4 the fourth moment of the t distribution is infinite, so the exact value varies noticeably from run to run.)

> cor(x[,1]^2, x[,2]^2)
[1] 0.3042684
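Because the sample correlation of squares depends on fourth moments, a rank-based check is a useful complement, since it does not rely on any moments existing. The sketch below is an illustration only: it builds the bivariate t sample in base R from normals and a shared chi-square scale (an assumption standing in for `rmvt`), and the names `rho_raw` and `rho_abs` are made up.

```r
# Sketch only: base-R stand-in for rmvt(n, sigma = diag(2), df = 4).
set.seed(123)
n  <- 100000
df <- 4
z  <- matrix(rnorm(2 * n), ncol = 2)   # independent standard normals
w  <- rchisq(n, df)                    # one shared chi-square draw per row
x  <- z / sqrt(w / df)                 # uncorrelated bivariate t sample

# Rank correlation of the raw values: should be close to zero
rho_raw <- cor(x[, 1], x[, 2], method = "spearman")

# Rank correlation of the magnitudes: clearly positive if the
# components tend to be large or small together
rho_abs <- cor(abs(x[, 1]), abs(x[, 2]), method = "spearman")

print(rho_raw)
print(rho_abs)
```

A clearly positive `rho_abs` alongside a near-zero `rho_raw` is the same signature of dependence as the correlated squares, without the sensitivity to extreme draws.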

