Point-biserial correlation and p-value

Question

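I am trying to get the point-biserial correlation between a continuous vocabulary score and syntactic productivity (dichotomous: productive vs. not productive).

I tried both the ltm package

> biserial.cor(lol$voc1_tvl, lol$synt, use = c("complete.obs"))

and the polycor package

> polyserial(lol$voc1_tvl, lol$synt, ML = FALSE, control = list(), std.err = FALSE, maxcor = .9999, bins = 4)

The problem is that neither of them gives me a p-value.

How can I run a point-biserial correlation test and get the associated p-value, or alternatively compute the p-value myself?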

Answer 1

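Since the point-biserial correlation is just a special case of the popular Pearson product-moment coefficient, you can use cor.test to approximate (more on that later) the correlation between a continuous X and a dichotomous Y. For example, given the following data: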

set.seed(23049)
x <- rnorm(1e3)
y <- sample(0:1, 1e3, replace = TRUE)

Running cor.test(x, y) gives you the information you want.

    Pearson's product-moment correlation

data:  x and y
t = -1.1971, df = 998, p-value = 0.2316
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.09962497  0.02418410
sample estimates:
        cor 
-0.03786575
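
Since the question also asks how to compute the p-value by hand, here is a minimal sketch of the usual t test for a Pearson correlation, using only the r and n shown in the output above. It assumes the standard result that r * sqrt(n - 2) / sqrt(1 - r^2) follows a t distribution with n - 2 degrees of freedom under the null hypothesis of zero correlation; the variable names are illustrative.

```r
# Values taken from the cor.test output above
r <- -0.03786575
n <- 1000

# t statistic for H0: correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

t_stat   # compare with t = -1.1971 in the output above
p_value  # compare with p-value = 0.2316 in the output above
```

The same two lines work for any point-biserial r and sample size, as long as n counts only the complete cases.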

To show how close these coefficients are, note how the computed correlation of -0.03786575 compares with the value returned by ltm::biserial.cor:

> library(ltm)
> biserial.cor(x, y, level = 2)
[1] -0.03784681

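The difference is that biserial.cor is computed on the population, with standard deviations divided by n, whereas cor and cor.test compute sample standard deviations, dividing by n - 1. In this example the two printed values differ by a factor of sqrt((n - 1)/n) = sqrt(999/1000).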
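
The n versus n - 1 point can be checked in base R without ltm. The sketch below writes out the textbook point-biserial formula, (m1 - m0) * sqrt(p * (1 - p)) / s, once with the divisor-n standard deviation of x and once with sd(x), which divides by n - 1. The "mixed" variant is only an illustration of how a sqrt((n - 1)/n) gap can arise; it is not a claim about how ltm implements biserial.cor. Depending on your R version, sample() may draw different values than in the answer, so the printed numbers need not match.

```r
# Illustrative data, generated the same way as in the answer above
set.seed(23049)
x <- rnorm(1e3)
y <- sample(0:1, 1e3, replace = TRUE)
n <- length(x)

# Point-biserial ingredients: group means and the proportion of y == 1
m1 <- mean(x[y == 1])
m0 <- mean(x[y == 0])
p  <- mean(y == 1)

# Standard deviation of x with divisor n (population) and n - 1 (sample)
sd_pop  <- sqrt(sum((x - mean(x))^2) / n)
sd_samp <- sd(x)

# With the divisor-n standard deviation the formula reproduces Pearson's r
r_pop <- (m1 - m0) * sqrt(p * (1 - p)) / sd_pop

# Mixing in the divisor-(n - 1) standard deviation shrinks it slightly
r_mixed <- (m1 - m0) * sqrt(p * (1 - p)) / sd_samp

all.equal(r_pop, cor(x, y))
all.equal(r_mixed, cor(x, y) * sqrt((n - 1) / n))
```

For n = 1000 the factor is about 0.9995, which is why the two coefficients agree to four decimal places.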

As cgage mentioned, you can also use the polyserial() function, which in my example yields

> polyserial(x, y, std.err = TRUE)

Polyserial Correlation, 2-step est. = -0.04748 (0.03956)
Test of bivariate normality: Chisquare = 1.891, df = 5, p = 0.864

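Here, I believe the difference in the computed correlation (-0.04748) arises because polyserial estimates the correlation with a normally distributed latent variable assumed to underlie Y, which for a two-level Y is the biserial rather than the point-biserial correlation. That extra machinery is only really needed when Y has more than two levels.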
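
One way to probe the gap between -0.03787 and -0.04748 is the textbook conversion from a point-biserial to a biserial correlation, r_b = r_pb * sqrt(p * (1 - p)) / dnorm(qnorm(p)). The sketch below applies it to the value printed by cor.test. It assumes p = 0.5 for the proportion of y == 1, since y was sampled uniformly from 0:1 and the exact sample proportion is not shown in the answer.

```r
# Point-biserial estimate reported by cor.test above
r_pb <- -0.03786575

# Assumed proportion of y == 1 (not shown in the answer)
p <- 0.5

# Height of the standard normal density at the threshold that splits off p
h <- dnorm(qnorm(p))

# Textbook conversion from point-biserial to biserial correlation
r_b <- r_pb * sqrt(p * (1 - p)) / h
r_b  # roughly -0.0475, close to the polyserial estimate
```

The small remaining difference from -0.04748 is consistent with the sample proportion not being exactly 0.5.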

Answer 2

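Using the ggplot2 dataset mpg as a reproducible example:

library(ggplot2)
# Use class as the dichotomous variable (must subset to two levels)
newData <- subset(mpg, class %in% c("midsize", "compact"))

# Now get the estimate together with its standard error
library(polycor)  # polyserial() is provided by polycor
polyserial(newData$cty, newData$class, std.err = TRUE)

You will see everything you need in the output when you pass std.err = TRUE to polyserial.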
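
Note that the p printed by polyserial belongs to the test of bivariate normality, not to a test of the correlation itself. If a p-value for the correlation is wanted, one option is a Wald-type z test built from the estimate and its standard error. The sketch below is an approximation under the assumption that the estimate is roughly normal in large samples; it reuses the estimate and standard error printed in Answer 1 (-0.04748 and 0.03956) rather than values from the mpg example.

```r
# Estimate and standard error as printed by polyserial in Answer 1
est <- -0.04748
se  <-  0.03956

# Wald-type z statistic for H0: correlation = 0
z <- est / se

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

z
p_value  # about 0.23, in line with the cor.test result in Answer 1
```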
