Estimation of the multivariate t distribution in R

Question

I would like to know whether there is any function in R that can estimate the degrees of freedom (df) of a multivariate t distribution.

The problem is simple: I have a matrix of 5 variables (columns) with 75 observations (rows). I would like to estimate the df of a multivariate t distribution fitted to that sample.

Thanks,

Juan.

Edit: following fabians' suggestion, I implemented the dmvt() approach:

# "residuals" is a matrix with residuals from a model. I want to estimate the df of  
# that sample assuming multivariate-t

sigma<-cor(residuals, use="pairwise.complete.obs", method="pearson")
my_means<-vector(length = 8)

for (i in 1:8){
  my_means[i]<-mean(my_matrix[,i]) 
}

residuals.scaled<-scale(residuals)
df.1 <-dmvt(residuals.scaled, my_means, sigma, log= FALSE, type = "shifted", df = 1)

I have some doubts regarding:

1) Scaling: scale() also centers the data. I don't know if this is correct.
2) Using log = FALSE: I don't know why the densities should be given as log(d) in my case.
3) From here I should compute the likelihood of the sample data for each df. Thus, more lines like df.2, df.3, etc. should be added, and the likelihood calculated for each. Then I choose the df with the highest likelihood. Is that correct?

Answer 1

Package mvtnorm supplies the density of a (shifted) multivariate t-distribution in the function dmvt. You could enter your (scaled) data and its sample correlation and compute the likelihood of your data for different values of df. Pick the value of df that maximizes the likelihood of your data.
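For reference, the quantity being maximized can be written out. This is a sketch assuming the standard parameterization of the p-dimensional t density with location μ, scale matrix Σ and ν degrees of freedom; the symbols μ, Σ, ν, p and n are notation introduced here, not taken from the question. For observations x_1, ..., x_n the log-likelihood as a function of ν is:

```latex
\ell(\nu) = \sum_{i=1}^{n} \left[
    \log\Gamma\!\left(\frac{\nu + p}{2}\right)
  - \log\Gamma\!\left(\frac{\nu}{2}\right)
  - \frac{p}{2}\log(\nu\pi)
  - \frac{1}{2}\log\lvert\Sigma\rvert
  - \frac{\nu + p}{2}
    \log\!\left(1 + \frac{(x_i - \mu)^{\top}\Sigma^{-1}(x_i - \mu)}{\nu}\right)
\right]
```

Summing log-densities rather than multiplying densities is what keeps this numerically stable for larger samples.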

EDIT:

library(mvtnorm)
set.seed(12121212)
################################################################################
## simulate n vectors of p-dim. t-distributed data in matrix X:
n <- 300
p <- 8

# draw random column means
means <- 10 * rnorm(p)

# correlation is AR(1) with correlation rho=.8
rho <- 0.8
sigma <- rho ^ abs(outer(1:p, 1:p, "-"))

# true degrees of freedom used for the simulation
df <- 3
# multiply column j by sqrt(j) so that the columns have different scales
X <- t(t(rmvt(n, sigma=sigma, delta=means, df=df)) * sqrt(1:p))


################################################################################
# evaluate t-likelihood for scaled X:

X_scale <- scale(X)
sigma_est <- cor(X_scale)

df_candidates <- seq(1, 20, by=2)
loglik <- numeric(length(df_candidates))
names(loglik) <- df_candidates
for(df in df_candidates){
    # no need for delta since we're working on scaled & centered data.
    # use sum(log(likelihood)), not prod(likelihood) to avoid numeric over/underflow 
    loglik[as.character(df)] <- sum(dmvt(x=X_scale, sigma=sigma_est, 
                                         df=df, log=TRUE))
}
loglik
#        1         3         5         7         9        11        13 
#-1788.219 -1756.301 -1768.885 -1783.724 -1797.386 -1809.556 -1820.382 
#       15        17        19 
#-1830.066 -1838.788 -1846.698 
## --> maximal for df=3, as used for the simulation.

## verify that mean shift can be incorporated into pre-processing as above:
all.equal(dmvt(X[1,], delta=means), dmvt(X[1,] - means))
#[1] TRUE
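
The grid search above only tries a handful of df values. As a possible refinement, the same log-likelihood can be maximized over a continuous range with base R's optimize(). This is a minimal sketch, not part of the original answer: the simulated data, the search interval c(1, 50) and the helper name loglik_df are assumptions chosen for illustration. It also keeps the answer's plug-in of the sample correlation for the scale matrix, which is an approximation (for df > 2 the covariance of a t-distribution is df/(df-2) times its scale matrix).

```r
library(mvtnorm)
set.seed(1)

# Simulated stand-in for the real data (hypothetical values):
# n observations of a p-dimensional t-distribution with 3 df
n <- 300
p <- 5
sigma_true <- 0.8 ^ abs(outer(1:p, 1:p, "-"))
X <- rmvt(n, sigma = sigma_true, df = 3)

# Same preprocessing as in the grid search: center and scale the columns,
# then use the sample correlation as the plug-in scale matrix
X_scale <- scale(X)
sigma_est <- cor(X_scale)

# Log-likelihood of the scaled data as a function of df
# (loglik_df is a helper name made up for this sketch)
loglik_df <- function(df) {
  sum(dmvt(X_scale, sigma = sigma_est, df = df, log = TRUE))
}

# One-dimensional maximization over an assumed search interval
fit <- optimize(loglik_df, interval = c(1, 50), maximum = TRUE)
fit$maximum    # estimated df
fit$objective  # log-likelihood at the estimate
```

To use it on real data, replace X with the data matrix (for example the residual matrix from the question) and widen the interval if the estimate lands on its upper bound.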
