简体   繁体   中英

pbcor and ggcorrmat correlations give different confidence intervals in R

I am working with multiple variables, where I would like to run a robust correlation and then extract the 95% confidence intervals. I am able to do this using pbcor from the WRS2 package.

However, when I want to plot these values, I use ggcorrmat from the ggstratsplot package. As I was checking the confidence intervals between the two outputs, I noticed they do not match up.

Here is a sample of my dataset:

Individual  varA    varB
1   2.9380842   0.09896456
2   2.9380842   -1.38772037
3   -0.6879859  -2.41310243
4   -0.6879859  0.55722346
5   -2.3129564  -1.34140699
6   -2.3129564  -1.75604301
7   -0.4937431  0.78381085
8   -0.4937431  0.38320385
9   -0.8558126  0.82125672
10  -0.8558126  0.06346062
11  -0.9211026  -1.67170174

Respective code/outputs using this sample dataset:

WRS2::pbcor(data$varA, data$varB, ci=TRUE, nboot=1000, beta=0.1) 
> robust correlation coefficient: 0.275
> test statistic: 0.8582
> p-value:0.41307
> bootstrap CI: [-0.3564; 0.7792]

ggstatsplot::ggcorrmat(data, cor.vars = c(OFT1, PC1), output = "dataframe", matrix.type = "lower", type = "robust", beta = 0.1, sig.level = 0.05, conf.level = 0.95, nboot = 1000)

>robust correlation: 0.275
>test statistic: 0.858
>p-value: 0.413
>CI: [-0.389, 0.751]

Why are the confidence intervals different, but the correlation values are the same?

You are right that the CIs differ between WRS2 and ggstatsplot because ggstatsplot internally doesn't use bootstrapping (which is slower and computationally costly) to compute the CIs.

Input <- ("
          Individual  varA    varB
1   2.9380842   0.09896456
2   2.9380842   -1.38772037
3   -0.6879859  -2.41310243
4   -0.6879859  0.55722346
5   -2.3129564  -1.34140699
6   -2.3129564  -1.75604301
7   -0.4937431  0.78381085
8   -0.4937431  0.38320385
9   -0.8558126  0.82125672
10  -0.8558126  0.06346062
11  -0.9211026  -1.67170174
          ")

# creating a dataframe
df <- read.table(textConnection(Input), header = TRUE)

set.seed(123)
WRS2::pbcor(df$varA, df$varB, ci = TRUE, nboot = 1000, beta = 0.1)
#> Call:
#> WRS2::pbcor(x = df$varA, y = df$varB, beta = 0.1, ci = TRUE, 
#>     nboot = 1000)
#> 
#> Robust correlation coefficient: 0.275
#> Test statistic: 0.8582
#> p-value: 0.41307 
#> 
#> Bootstrap CI: [-0.4476; 0.8223]

set.seed(123)
ggstatsplot::ggcorrmat(
  data = dplyr::select(df, -Individual),
  type = "robust",
  output = "dataframe",
  nboot = 1000, 
  beta = 0.1
)

#> # A tibble: 1 x 10
#>   parameter1 parameter2     r ci_low ci_high     t    df     p method       nobs
#>   <chr>      <chr>      <dbl>  <dbl>   <dbl> <dbl> <dbl> <dbl> <chr>       <int>
#> 1 varA       varB       0.275 -0.389   0.751 0.809     9 0.439 Percentage~    11

It instead returns non-central confidence intervals for effect sizes whenever it can.

If you are curious, the relevant piece of code used to compute CIs is here: https://github.com/easystats/correlation/blob/ddd105da55c8b5a81e4ce97b8938f5f00e6e968b/R/cor_to_ci.R#L70-L85

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM