R: How to for-loop a multiple linear regression that drops factors with <2 levels
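I am trying to loop a multiple linear regression and automatically drop any factor that does not have at least two levels, in order to avoid the following error message: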
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
Right now my code is:
library(dplyr)
library(broom)

df %>%
  group_by(crop_name) %>%
  do(tidy(lm(formula = value ~ intercrop +
               erosion_c + purchased_seed + inorg_pest +
               org_pest + landscape + fert + inorgfert,
             data = . )))
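The problem is that some crops have large sample sizes, with plenty of observations for every variable I am regressing on, while other crops have small sample sizes and no observations at all that received a given treatment (for example, none of that crop's observations were intercropped, and so on).
Is there a way to for-loop this so that R regresses on everything it can, drops whatever it cannot use, and avoids this error message?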
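The failure described above can be reproduced on a small scale. The following is a minimal sketch with made-up data (the data frame `toy` and its values are hypothetical, not taken from the question): one factor has a single level in the subset, so `lm()` stops with the contrasts error. Counting the levels per column first shows which predictor is the culprit.

```r
# Toy data (hypothetical): 'landscape' has only one level in this subset
toy <- data.frame(
  value     = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
  org_pest  = factor(c("A", "B", "A", "A", "B", "B")),
  landscape = factor(rep("A", 6))
)

# Number of levels actually present in each factor column
n_lev <- sapply(toy[c("org_pest", "landscape")],
                function(x) nlevels(droplevels(x)))
print(n_lev)

# Fitting with the single-level factor fails; capture the error message
res <- tryCatch(
  lm(value ~ org_pest + landscape, data = toy),
  error = function(e) conditionMessage(e)
)
print(res)
```

Here `res` ends up holding the error message as a string rather than a fitted model, which is the situation that arises for the small crop groups.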
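I'm quite new to this, so it may not be the best approach. You will probably need to set up a for loop over crop_name, because in my example df is the subset for a single crop group.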
# stringsAsFactors = TRUE is required in R >= 4.0 so the columns are factors
df <- data.frame(intercrop = c("A","B","C","A","B","C"),
                 erosion_c = c("A","D","C","A","B","C"),
                 purchased_seed = c("A","B","D","F","E","C"),
                 inorg_pest = c("A","B","C","A","B","C"),
                 org_pest = c("A","B","A","A","B","B"),
                 landscape = c("A","A","A","A","A","A"),
                 fert = c("A","B","C","A","B","C"),
                 inorgfert = c("A","B","C","A","B","C"),
                 stringsAsFactors = TRUE
)
# Levels of each column, always returned as a list
yo <- lapply(df, levels)
# Number of levels per column, as a named integer vector
hi <- lengths(yo)
# Keep only the variables with at least two levels (drops landscape here)
formu <- names(hi)[hi >= 2]
# Join the remaining names with " + " to build the model formula
formu <- as.formula(paste("value ~", paste(formu, collapse = " + ")))
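To extend this to every crop, the same check can be run inside a loop over crop_name. The following is only a sketch under stated assumptions: the data frame `dat`, its values, and the two-element `predictors` vector are made up for illustration, and base R's `reformulate()` stands in for the `paste()` step above. Coefficient tables come from `coef(summary(...))` so that no extra packages are needed.

```r
# Toy data (hypothetical): for "beans" every row has landscape == "A"
set.seed(1)
dat <- data.frame(
  crop_name = rep(c("maize", "beans"), each = 8),
  value     = rnorm(16),
  intercrop = rep(c("yes", "no"), times = 8),
  landscape = c(rep(c("A", "B"), each = 4), rep("A", 8)),
  stringsAsFactors = TRUE
)

predictors <- c("intercrop", "landscape")

fits <- list()
for (crop in levels(dat$crop_name)) {
  # Subset to one crop and discard factor levels that no longer occur
  sub <- droplevels(dat[dat$crop_name == crop, ])
  # Keep predictors that still have at least two distinct values
  keep <- predictors[sapply(sub[predictors],
                            function(x) length(unique(x)) >= 2)]
  if (length(keep) == 0) next  # nothing left to regress on for this crop
  # Build value ~ <remaining predictors> and fit the model
  formu <- reformulate(keep, response = "value")
  fits[[crop]] <- lm(formu, data = sub)
}

# One coefficient table per crop
lapply(fits, function(m) coef(summary(m)))
```

In this toy data the model for "beans" is fitted without landscape, while the model for "maize" keeps both predictors. If the real data contain missing values, it may be worth counting distinct non-missing values instead.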