简体   繁体   English

使用从 R 的 tidyverse 'map' 的 output 中提取的 lm 使用 'segmented' 时出错

[英]Error using 'segmented' with lm extracted from output of tidyverse 'map' in R

I am using the 'segmented' package to find break points in linear regressions in R我正在使用“分段”package 在 R 的线性回归中找到断点

library(tidyverse)
library(segmented)

df <- data.frame(x = c(1:10), y = c(1,1,1,1,1,6:10))
lm_model <- lm(y ~ x, data = df)
seg_model <- segmented(obj = lm_model, seg.Z = ~ x)

But if I run the same model within a purrr:map, segmented fails.但是,如果我在 purrr:map 中运行相同的 model,则分段失败。

map_test <- df %>% 
  nest() %>%
  mutate(map_lm = map(data, ~lm(y ~ x, data = .)),
         param_map_lm = map(map_lm, tidy))

map_lm_model <- map_test[[2]][[1]]

map_seg_model <- segmented(obj = map_lm_model, seg.Z = ~ x)

"Error in is.data.frame(data): object '.' “is.data.frame(数据)中的错误:object '。' not found"未找到”

When taking the lm obj from the lm extracted from the map output, segmented fails to find the underlying data.从map output中提取的lm中取出lm obj时,segmented找不到底层数据。

The two linear model objects, appear, however, identical.然而,两个线性 model 对象看起来是相同的。

What I actually need to do is a more useful map to run lm over multiple sub-sets of a dataframe, then run 'segmented' on the resulting lm.我真正需要做的是一个更有用的 map 在 dataframe 的多个子集上运行 lm,然后在生成的 lm 上运行“分段”。

This is basically the same issue as the interaction between glm() and purrr::map() .与 glm() 和 purrr::map() 之间的交互基本相同。

lm() captures the expression supplied to it, which works well as a stand-alone case. lm()捕获提供给它的表达式,它作为一个独立的案例工作得很好。 However, when called by map() , the supplied expression is .但是,当被map()调用时,提供的表达式是. , which has no meaning outside the immediate context of that map() call and results in the error you are observing. ,在该map()调用的直接上下文之外没有任何意义,并导致您观察到的错误。

As with the other question, one workaround is to define a wrapper for lm() that composes a custom call directly on the dataset, which is then captured by lm() as an unevaluated expression.与另一个问题一样,一种解决方法是为lm()定义一个包装器,该包装器直接在数据集上组成自定义调用,然后lm()将其捕获为未评估的表达式。

# Composes a custom lm() expression and evaluates it
lm2 <- function(data, ...)
    eval( rlang::expr(lm(data=!!rlang::enexpr(data), !!!list(...))) )

# Now mapping using lm2, instead of lm
map_test <- nest(df, data=everything()) %>% 
    mutate(map_lm       = map(data, lm2, y ~ x),
           param_map_lm = map(map_lm, broom::tidy))

# The data is stored directly inside the lm object
# segmented() now has no problems accessing it
map_lm_model <- map_test[[2]][[1]]
map_seg_model <- segmented(obj = map_lm_model, seg.Z = ~ x)
# Call: segmented.lm(obj = map_lm_model, seg.Z = ~x)
# 
# Meaningful coefficients of the linear terms:
# (Intercept)            x         U1.x  
#   1.000e+00    6.344e-15    1.607e+00  
# 
# Estimated Break-Point(s):
# psi1.x  
#  3.889  

or as a single mutate() chain:或作为单个mutate()链:

map_test <- nest(df, data=everything()) %>% 
    mutate(map_lm       = map(data, lm2, y ~ x),
           param_map_lm = map(map_lm, broom::tidy),
           seg_lm       = map(map_lm, segmented, seg.Z=~x))
# # A tibble: 1 x 4
#             data map_lm param_map_lm     seg_lm    
#   <list<df[,2]>> <list> <list>           <list>    
# 1       [10 × 2] <lm>   <tibble [2 × 5]> <segmentd>

You could store model data along with model object, and replace .您可以将 model 数据与 model object 一起存储,并替换. in model object's data name by stored data name :在 model 对象的数据name中按存储的数据name

map_test <- df %>% 
  nest(data=everything()) %>%
  mutate(map_lm = map(data, ~lm(y ~ x, data = . )),
         data_map_lm = data, # Store model data
         param_map_lm = map(map_lm, broom::tidy))

map_lm_model <- map_test$map_lm[[1]]
map_lm_data <- map_test$data_map_lm[[1]]

# Set stored data name in model
map_lm_model$call$data <- as.name("map_lm_data")

map_seg_model <- segmented(obj = map_lm_model, seg.Z = ~ x)
map_seg_model

# Call: segmented.lm(obj = map_lm_model, seg.Z = ~x)
# 
# Meaningful coefficients of the linear terms:
#   (Intercept)            x         U1.x  
# 1.000e+00   -1.874e-08    1.607e+00  
# 
# Estimated Break-Point(s):
#   psi1.x  
# 3.889  

On a nest with more levels:在具有更多级别的nest上:

df <- data.frame(x = c(c(1:10),c(11:20)),
                 y = rep(c(1,1,1,1,1,6:10),2),
                 level=c(rep('a',10),rep('b',10)))


map_test <- df %>%
  nest(data=c(x,y)) %>%
  mutate(map_lm = map(data, ~lm(y ~ x, data = . )),
         data_map_lm = data,
         param_map_lm = map(map_lm, broom::tidy))


map_test %>% pmap(~with(list(...),{
  map_lm$call$data <- as.name("data_map_lm")
  segmented(obj = map_lm, seg.Z = ~ x)
}))

#> [[1]]
#> Call: segmented.lm(obj = map_lm_model, seg.Z = ~x)
#> 
#> Meaningful coefficients of the linear terms:
#> (Intercept)            x         U1.x  
#>   1.000e+00   -1.874e-08    1.607e+00  
#> 
#> Estimated Break-Point(s):
#> psi1.x  
#>  3.889  
#> 
#> [[2]]
#> Call: segmented.lm(obj = map_lm_model, seg.Z = ~x)
#> 
#> Meaningful coefficients of the linear terms:
#> (Intercept)            x         U1.x  
#>   1.000e+00   -1.937e-08    1.607e+00  
#> 
#> Estimated Break-Point(s):
#> psi1.x  
#>  13.89

An option could be using map2 from purrr to immediately use the "map_lm" model on the segmented model.一个选项可以是使用purrr中的map2来立即在segmented model 上使用“map_lm”model。 In that case .y is used so that the data keeps the same name as use with the lm model.在这种情况下,使用.y以便数据保持与使用lm model 的名称相同的名称。 Here is a reproducible example:这是一个可重现的示例:

library(dplyr)
library(tidyr)
library(segmented)
library(broom)
library(purrr)

df <- data.frame(x = c(1:10), y = c(1,1,1,1,1,6:10))
lm_model <- lm(y ~ x, data = df)
seg_model <- segmented(obj = lm_model, seg.Z = ~ x)

# Combine in one dataframe
map_test <- df %>% 
  nest(data = everything()) %>%
  mutate(map_lm = purrr::map(data, ~lm(y ~ x, data = .)),
         map_seg = purrr::map2(data, map_lm, ~segmented(.y, seg.Z = ~ x)))

# Show segmented model
map_test[[3]]
#> [[1]]
#> Call: segmented.lm(obj = .y, seg.Z = ~x)
#> 
#> Meaningful coefficients of the linear terms:
#> (Intercept)            x         U1.x  
#>   1.000e+00   -1.874e-08    1.607e+00  
#> 
#> Estimated Break-Point(s):
#> psi1.x  
#>  3.889

# Plot lm and seg model
plot(map_test$data[[1]])
lines(fitted(map_test$map_lm[[1]]), col = "red")
lines(fitted(map_test$map_seg[[1]]), col = "blue")
legend(x = "topleft", 
       legend = c("lm", "seg"),
       pch = 15,
       col = c("red", "blue"))

Created on 2022-09-04 with reprex v2.0.2使用reprex v2.0.2创建于 2022-09-04

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM