P-value for polyserial correlation

Question

I have some basic questions concerning the polyserial() function from the polycor package.

Does a p-value exist for rho, or can it be calculated?
For the assumption of bivariate normality, is the tested null hypothesis "yes, bivariate normal"? That is, do I want a high or a low p-value?

Thanks.

Answer 1

If you form the returned object with:

 polS <- polyserial(x, y, ML=TRUE, std.err=TRUE) # ML estimate

... you should have no difficulty forming a p-value for the hypothesis rho == 0, using a z-statistic formed by dividing the parameter estimate by its standard error. But that is not the same as testing the assumption of bivariate normality. For that you need to examine the "chisq" component of polS. The print method for objects of class 'polycor' hands that to you in a nice little sentence. You interpret that result in the usual manner: low p-values are stronger evidence against the null hypothesis (in this case H0: bivariate normality). As a scientist, you do not "want" either result. You want to understand what the data are telling you.
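
As a rough sketch of the two calculations described above, using only base R: the numbers are copied from the 2-step output printed in Answer 2 rather than taken from a fitted object, and the helper name wald_p_value is made up for illustration. With a real fit, the inputs would come from polS$rho, sqrt(polS$var[1, 1]), polS$chisq and polS$df.

```r
# Hypothetical helper (not part of polycor): two-sided p-value for
# H0: rho == 0 from a z-statistic, i.e. estimate / standard error.
wald_p_value <- function(estimate, std_err) {
  z <- estimate / std_err
  2 * pnorm(-abs(z))
}

# Estimate and standard error copied from the 2-step output in Answer 2.
p_rho <- wald_p_value(0.5085, 0.02413)

# Test of bivariate normality: upper-tail probability of the chi-square
# statistic (Chisquare = 8.604, df = 11 in the same output).
p_normality <- pchisq(8.604, df = 11, lower.tail = FALSE)

p_rho        # very small: strong evidence against rho == 0
p_normality  # large: no evidence against bivariate normality
```

The two p-values answer different questions, so a small one for rho and a large one for the normality test are perfectly compatible.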

Answer 2

I e-mailed the package author (because I had the same questions), and based on his clarifications, I offer my answers:

First, the easy question: higher p-values (traditionally > 0.05) are consistent with the distribution being bivariate normal. Lower p-values indicate a non-normal distribution, but if the sample size is sufficiently large and you use the maximum likelihood estimate (option ML=TRUE), non-normality doesn't matter; the correlation is still reliable anyway.

Now, for the harder question: to calculate the p-value, you need to:

Execute polyserial with the std.err=TRUE option to have access to more details.
From the resulting polyserial object, access the var[1, 1] element. var is the covariance matrix of the parameter estimates, and sqrt(var[1, 1]) is the standard error (which is displayed in parentheses after the rho result in the output).
From the standard error, you can calculate the p-value as in the R code below.

Here's some copyable R code to illustrate this, based on the example code in the polyserial documentation:

library(mvtnorm)
library(polycor)

set.seed(12345)
data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
x <- data[,1]
y <- data[,2]
y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))

# 2-step estimate
poly_2step <- polyserial(x, y, std.err=TRUE)  
poly_2step
## 
## Polyserial Correlation, 2-step est. = 0.5085 (0.02413)
## Test of bivariate normality: Chisquare = 8.604, df = 11, p = 0.6584
std.err_2step <- sqrt(poly_2step$var[1, 1])
std.err_2step
## [1] 0.02413489
p_value_2step <- 2 * pnorm(-abs(poly_2step$rho / std.err_2step))
p_value_2step
## [1] 1.529176e-98

# ML estimate
poly_ML <- polyserial(x, y, ML=TRUE, std.err=TRUE) 
poly_ML
## 
## Polyserial Correlation, ML est. = 0.5083 (0.02466)
## Test of bivariate normality: Chisquare = 8.548, df = 11, p = 0.6635
## 
##                  1      2       3
## Threshold -0.98560 0.4812 1.50700
## Std.Err.   0.04408 0.0379 0.05847
std.err_ML <- sqrt(poly_ML$var[1, 1])
std.err_ML
## [1] 0.02465517
p_value_ML <- 2 * pnorm(-abs(poly_ML$rho / std.err_ML))
p_value_ML
##              
## 1.927146e-94
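
The same standard error can also be turned into an approximate confidence interval for rho, which is often more informative than a p-value this small. The following is only a sketch under the assumption that the estimate is roughly normally distributed (the same assumption the z-test relies on); wald_ci is a made-up helper name, and the inputs are the rounded ML estimate and standard error printed above.

```r
# Hypothetical helper (not part of polycor): approximate Wald confidence
# interval, assuming the estimate is approximately normally distributed.
wald_ci <- function(estimate, std_err, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  c(lower = estimate - z_crit * std_err,
    upper = estimate + z_crit * std_err)
}

# Rounded ML estimate and standard error from the printed output.
ci_ml <- wald_ci(0.5083, 0.02466)
ci_ml
```

With a fitted object, the call would be wald_ci(poly_ML$rho, std.err_ML). An interval that stays well away from zero tells the same story as the tiny p-value.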

And to answer an important question that you didn't ask: you should always use the maximum likelihood version (ML=TRUE) because it is more accurate, unless you have a really slow computer, in which case the default 2-step approach is acceptable.

Answer 1
2 (accepted) 2013-04-29 15:29:40

Answer 2
1 2017-05-13 14:32:43
