
How to use map from the purrr package in an efficient and easy way?

I'm trying to use map from the purrr package more efficiently than I'm doing right now. I have 3 different datasets, let's say iris_1, iris_2, and iris_3.
I want to run the same linear regression on all 3 datasets. My final goal is to get all the coefficients from each of these 3 regressions using map.

My code looks like this:

library(purrr)
library(dplyr)
library(tidyr)

# Load data
iris <- iris

#-------------------------------------------------------------------------------------------------------------#
#Basic modifications
#-------------------------------------------------------------------------------------------------------------#
iris_1   <- iris %>% dplyr::filter(Species=="versicolor") 
iris_2   <- iris %>% dplyr::filter(Species=="virginica") 
iris_3   <- iris %>% dplyr::filter(Species=="setosa") 

Databases <- list(iris_1,iris_2,iris_3)

####Step A
Linear_Models <- map(Databases, ~ lm(Sepal.Length ~ Sepal.Width + Petal.Length , data = .x))
M_1     <- Linear_Models[[1]]
M_2     <- Linear_Models[[2]]
M_3     <- Linear_Models[[3]]

####Step B
Linear_Models_Coeff <- list(M_1,M_2,M_3)
Coeff <- map(Linear_Models_Coeff, ~ coef(summary(.x)))
C_M_1     <- Coeff[[1]]
C_M_2     <- Coeff[[2]]
C_M_3     <- Coeff[[3]]

I tried to do the previous steps more efficiently (that is, combining steps A and B) as follows. However, when I try to get the coefficients, I don't get the same result as in the previous steps (i.e. C_M_1 <- Coeff[[1]]).

Linear_Models <- map(Databases, ~ lm(Sepal.Length ~ Sepal.Width + Petal.Length , data = .x),~ coef(summary(.x)))
C_M_1     <- Linear_Models[[1]]

Many thanks in advance! I know there are multiple ways of doing this with packages other than purrr, but I would really appreciate an answer that uses purrr.

You could do this in one go by piping all the functions inside map(), e.g.:

purrr::map(Databases, ~ lm(Sepal.Length ~ Sepal.Width + Petal.Length , 
                                     data = .x) %>% summary() %>% coef()) %>% 
  set_names(c("M1", "M2", "M3"))

Result:

$M1
              Estimate Std. Error  t value     Pr(>|t|)
(Intercept)  2.1164314  0.4942556 4.282059 9.063960e-05
Sepal.Width  0.2476422  0.1868389 1.325431 1.914351e-01
Petal.Length 0.7355868  0.1247678 5.895648 3.870715e-07

$M2
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept)  0.6247824 0.52486745  1.190362 2.398819e-01
Sepal.Width  0.2599540 0.15333757  1.695305 9.663372e-02
Petal.Length 0.9348189 0.08960197 10.433017 8.009442e-14

$M3
              Estimate Std. Error  t value     Pr(>|t|)
(Intercept)  2.3037382 0.38529423 5.979166 2.894273e-07
Sepal.Width  0.6674162 0.09035581 7.386533 2.125173e-09
Petal.Length 0.2834193 0.19722377 1.437044 1.573296e-0
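
As for why the combined attempt in the question did not return coefficients: as far as I understand it, map() takes a single function (.f), and anything after it is forwarded to that function as an extra argument. The second formula is therefore handed to the first lambda, which ignores it, so the result is still the list of fitted models. A minimal sketch of this, plus an alternative that chains two map() calls, is below. It assumes only purrr and base R; the name attempt is just for illustration, and the data sets are rebuilt with base subsetting instead of dplyr::filter.

```r
library(purrr)

# Rebuild the three per-species data sets from the question (base R subsetting)
Databases <- list(
  iris[iris$Species == "versicolor", ],
  iris[iris$Species == "virginica", ],
  iris[iris$Species == "setosa", ]
)

# The attempt from the question: the second formula is passed to the first
# lambda as an extra argument and is ignored, so each element is an lm object
attempt <- map(
  Databases,
  ~ lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = .x),
  ~ coef(summary(.x))
)
class(attempt[[1]])

# Alternative: one map() per step, chained with the pipe
Coeff <- Databases %>%
  map(~ lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = .x)) %>%
  map(~ coef(summary(.x))) %>%
  set_names(c("M1", "M2", "M3"))

# Coefficient matrix for the first data set (versicolor)
Coeff$M1
```

Chaining keeps steps A and B visible as separate stages, while the single map() shown earlier fits and summarises each model in one pass. Both should give the same coefficient matrices.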

