
Extract model info from a model saved as a list column in R

I'm trying to extract model info from models stored in a list column. Using mtcars to illustrate my problem:

mtcars %>%  
    nest(-cyl) %>% 
    mutate(model= map(data, ~lm(mpg~wt, data=.))) %>% 
    mutate(aic=AIC(model))

What I get is this error message:

Error in mutate_impl(.data, dots) : 
  Evaluation error: no applicable method for 'logLik' applied to an object of class "list".

But when I do it this way, it works.

mtcars %>%  
    group_by(cyl) %>% 
    do(model= lm(mpg~wt, data=.)) %>% 
    mutate(aic=AIC(model))

Can anyone explain why the second way works? I could not figure it out. In both cases, the list column 'model' contains the fitted models, but there must be some difference. Thanks a lot.

Let's compare the two approaches. We can run all of your code except the final AIC call and save the results to a and b.

a <- mtcars %>%  
  nest(-cyl) %>% 
  mutate(model= map(data, ~lm(mpg~wt, data=.))) 

b <- mtcars %>%  
  group_by(cyl) %>% 
  do(model= lm(mpg~wt, data=.)) 

Now we can print the results in the console.

a
# A tibble: 3 x 3
    cyl               data    model
  <dbl>             <list>   <list>
1     6  <tibble [7 x 10]> <S3: lm>
2     4 <tibble [11 x 10]> <S3: lm>
3     8 <tibble [14 x 10]> <S3: lm>

b
Source: local data frame [3 x 2]
Groups: <by row>

# A tibble: 3 x 2
    cyl    model
* <dbl>   <list>
1     4 <S3: lm>
2     6 <S3: lm>
3     8 <S3: lm>

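Now we can see that dataframe b is grouped by row, while dataframe a is not. This is the key. In a row-wise dataframe, mutate evaluates AIC(model) once per row, so model refers to a single lm object. In a, mutate passes the whole list column to AIC in one call, and AIC (through logLik) has no method for a plain list, hence the error.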
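The difference can be reproduced without dplyr at all. The following is a minimal sketch in base R; the names models, whole_list_result and aic_each are made up for illustration and do not appear in the question.

```r
# Fit one lm per cylinder group; the result is a plain list of lm objects,
# which is what a list column holds.
models <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))

class(models)       # the container is a list
class(models[[1]])  # each element is an lm

# Calling AIC on the whole list fails, just like mutate(aic = AIC(model))
# on an ungrouped dataframe. Capture the error message instead of stopping.
whole_list_result <- tryCatch(
  AIC(models),
  error = function(e) conditionMessage(e)
)
print(whole_list_result)

# Calling AIC on one element at a time works, which is what rowwise()
# or map_dbl() arrange for you.
aic_each <- vapply(models, AIC, numeric(1))
print(aic_each)
```

The first call reaches AIC with an object of class "list", while the second hands it one lm object per call.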

To extract AIC in dataframe a, we can use the rowwise function to group the dataframe by row.

mtcars %>%  
  nest(-cyl) %>% 
  mutate(model= map(data, ~lm(mpg~wt, data=.))) %>%
  rowwise() %>%
  mutate(aic=AIC(model))

Source: local data frame [3 x 4]
Groups: <by row>

# A tibble: 3 x 4
    cyl               data    model      aic
  <dbl>             <list>   <list>    <dbl>
1     6  <tibble [7 x 10]> <S3: lm> 25.65036
2     4 <tibble [11 x 10]> <S3: lm> 61.48974
3     8 <tibble [14 x 10]> <S3: lm> 63.31555

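Or we can use the map_dbl function from purrr, which applies AIC to each model in the list column and returns a numeric vector. This works because we know each AIC is a single number.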

mtcars %>%  
  nest(-cyl) %>% 
  mutate(model= map(data, ~lm(mpg~wt, data=.))) %>%
  mutate(aic = map_dbl(model, AIC))

# A tibble: 3 x 4
    cyl               data    model      aic
  <dbl>             <list>   <list>    <dbl>
1     6  <tibble [7 x 10]> <S3: lm> 25.65036
2     4 <tibble [11 x 10]> <S3: lm> 61.48974
3     8 <tibble [14 x 10]> <S3: lm> 63.31555
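The same map_dbl pattern extends to other per-model quantities. Below is a hedged sketch, assuming a recent tidyr (1.0.0 or later), where the unnamed form nest(-cyl) is deprecated in favour of nest(data = -cyl). The column names r_squared and slope are chosen here for illustration only.

```r
library(dplyr)
library(tidyr)
library(purrr)

result <- mtcars %>%
  nest(data = -cyl) %>%                                   # named form of nest(-cyl)
  mutate(model = map(data, ~ lm(mpg ~ wt, data = .x))) %>%
  mutate(
    aic       = map_dbl(model, AIC),                         # one AIC per model
    r_squared = map_dbl(model, ~ summary(.x)$r.squared),     # one R-squared per model
    slope     = map_dbl(model, ~ coef(.x)[["wt"]])           # coefficient on wt
  )

result %>% select(cyl, aic, r_squared, slope)
```

Each map_dbl call visits the models one at a time, so no rowwise grouping is needed.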
