R for loop to compute correlation between one x and multiple y variables

I am trying to compute the correlation of one x variable with multiple y variables. I am using a for loop similar to this code:

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

for (y in grep("var", colnames(df), value = TRUE)) {
  summarise(df, cor(x, y))
}

I receive the following error message:

Error in `summarise()`:
! Problem while computing `..1 = cor(x, y)`.
Caused by error in `cor()`:
! 'y' must be numeric

My guess is that the "y" in the correlation function is not being interpreted as a variable name. Does anyone have any hints on how to fix this?

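Your guess is right: inside the loop, y holds a column name as a character string (for example "var1"), so cor() receives text instead of the numeric column. Using dplyr::across() you could do: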

set.seed(123)

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

library(dplyr)

summarise(df, across(!x, ~ cor(x, .x)))
#>          var1      var2      var3
#> 1 -0.04953215 -0.129176 -0.044079
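
If you would rather keep the explicit loop, here is a minimal base R sketch. It assumes the same df as above; the names y_cols and cors are made up for illustration. The idea is to look each column up by its name with df[[y]] and to store the result, since values computed inside a for loop are not printed automatically.

```r
set.seed(123)

df <- data.frame(x = rnorm(100),
                 var1 = rnorm(100),
                 var2 = rnorm(100),
                 var3 = rnorm(100))

# Column names matching "var"; each y in the loop is a string such as "var1"
y_cols <- grep("var", colnames(df), value = TRUE)

# Named numeric vector with one slot per y column (hypothetical helper name)
cors <- setNames(numeric(length(y_cols)), y_cols)

for (y in y_cols) {
  # df[[y]] fetches the column by name, so cor() gets numbers, not a string
  cors[y] <- cor(df$x, df[[y]])
}

# Print explicitly; nothing is auto-printed from inside the loop
print(cors)

# The same result without an explicit loop
print(sapply(df[y_cols], cor, x = df$x))
```

If you want to stay inside summarise(), the .data pronoun plays the same role: cor(x, .data[[y]]) should resolve the string to a column, though you would still need to print or collect each result yourself.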
