Robust linear regression with lapply

I'm having trouble running a robust linear regression model (using rlm from the MASS package) over a list of data frames.

Reproducible example:

var1 <- c(1:100)
var2 <- var1*var1
df1  <- data.frame(var1, var2)
var1 <- var1 + 50
var2 <- var2*2
df2  <- data.frame(var1, var2)
lst1 <- list(df1, df2)

Linear model (works):

lin_mod <- lapply(lst1, lm, formula = var1 ~ var2)
summary(lin_mod[[1]])

My code for the robust model:

rob_mod <- lapply(lst1, MASS::rlm, formula = var1 ~ var2)

gives the following error:

Error in rlm.default(X[[i]], ...) : 
argument "y" is missing, with no default

How can I solve this?

The error with my actual data is:

Error in qr.default(x) : NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In storage.mode(x) <- "double" : NAs introduced by coercion    

You can also try a purrr::map solution:

library(tidyverse)
# rlm() lives in MASS, which tidyverse does not attach, so qualify the call
map(lst1, ~ MASS::rlm(var1 ~ var2, data = .))

or, as joran commented:

map(lst1, MASS:::rlm.formula, formula = var1 ~ var2)

As you can see in ?lm, lm has only a formula interface. In contrast, ?rlm shows two methods: a formula method and a default method that takes x and y. rlm picks the method from its first argument, and lapply passes each data frame as that first, unnamed argument, so the default method is used and it then asks for y. To get the formula method, put the formula first and pass the data frame by name with data =. Otherwise rlm expects x and y as input.
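To make the two interfaces concrete, here is a minimal sketch that fits the same model through both. The data frame mirrors df1 from the question; the names X, fit_formula and fit_xy are only illustrative, and adding the intercept column by hand is my assumption about how to mimic the design matrix that the formula method builds.

```r
library(MASS)

# Same data as df1 in the question
df1 <- data.frame(var1 = 1:100, var2 = (1:100)^2)

# Formula interface: the formula comes first, the data frame is passed by name
fit_formula <- rlm(var1 ~ var2, data = df1)

# x/y interface: a design matrix (intercept column added by hand) and a response vector
X <- cbind("(Intercept)" = 1, var2 = df1$var2)
fit_xy <- rlm(X, df1$var1)

# Compare the coefficients obtained through the two routes
all.equal(coef(fit_formula), coef(fit_xy))
```

In everyday use the formula interface is usually the more convenient of the two, since it builds the design matrix for you.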

Your call is missing the data argument. lapply calls FUN with each member of the list as the first argument of FUN, but data is the second argument of rlm's formula method, so the data frame never reaches it.
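One way to see where the data frame ends up is match.call(), which reports how R matches the arguments without fitting anything. This is only an illustrative sketch: the two calls are written out by hand to mimic what lapply constructs, and m_lm and m_rlm are made-up names.

```r
library(MASS)

df1 <- data.frame(var1 = 1:100, var2 = (1:100)^2)

# lapply(lst1, lm, formula = var1 ~ var2) effectively calls lm(df, formula = ...)
m_lm <- match.call(lm, quote(lm(df1, formula = var1 ~ var2)))
print(m_lm)   # the unnamed data frame fills lm's second formal, `data`

# The same pattern with rlm, whose generic has the formals (x, ...)
m_rlm <- match.call(rlm, quote(rlm(df1, formula = var1 ~ var2)))
print(m_rlm)  # the unnamed data frame fills `x`, so dispatch is based on the data frame
```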

The solution is to define an anonymous function.

rob_mod <- lapply(lst1, function(DF) MASS::rlm(formula = var1 ~ var2, data = DF))
summary(rob_mod[[1]])
#
#Call: rlm(formula = var1 ~ var2, data = DF)
#Residuals:
#    Min      1Q  Median      3Q     Max 
#-18.707  -5.381   1.768   6.067   7.511 
#
#Coefficients:
#              Value   Std. Error t value
#(Intercept) 19.6977  1.0872    18.1179
#var2         0.0092  0.0002    38.2665
#
#Residual standard error: 8.827 on 98 degrees of freedom
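Once the models are in a list, the usual extractor functions can be mapped over it. A small sketch, rebuilding lst1 from the question; the element names df1 and df2 and the object coef_tab are added here purely for illustration.

```r
library(MASS)

# Rebuild the two data frames from the question, this time as a named list
var1 <- 1:100
var2 <- var1^2
lst1 <- list(df1 = data.frame(var1, var2),
             df2 = data.frame(var1 = var1 + 50, var2 = var2 * 2))

# Fit one robust model per data frame via an anonymous function
rob_mod <- lapply(lst1, function(DF) rlm(var1 ~ var2, data = DF))

# Collect the coefficients: one row per data frame, one column per term
coef_tab <- t(sapply(rob_mod, coef))
print(coef_tab)
```

Naming the list elements carries those names through to the rows of the result, which makes it easier to tell the fits apart.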

