R: How to for loop a multiple linear regression which drops factors with <2 levels

I am trying to loop a multiple linear regression and automatically drop factors that don't have at least two levels, to avoid the following error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
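For readers who want to see the error in isolation, here is a minimal sketch that reproduces it. The data frame `toy` and its values are made up for illustration and are not from the question; `landscape` is given a single level on purpose.

```r
# Toy data (hypothetical): 'landscape' has only one level,
# so lm() cannot build contrasts for it
toy <- data.frame(
  value     = c(1.2, 2.3, 3.1, 4.8),
  intercrop = factor(c("yes", "no", "yes", "no")),
  landscape = factor(c("flat", "flat", "flat", "flat"))
)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    lm(value ~ intercrop + landscape, data = toy)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Leaving the single-level predictor out lets the fit go through
fit <- lm(value ~ intercrop, data = toy)
```

The same thing happens when a character or factor column has several possible values overall but only one of them occurs in the rows passed to lm().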

Right now my code is:

library(dplyr)  # group_by(), do(), %>%
library(broom)  # tidy()

df %>%
  group_by(crop_name) %>%
  do(tidy(lm(
    formula = value ~ intercrop + erosion_c + purchased_seed + inorg_pest +
      org_pest + landscape + fert + inorgfert,
    data = .
  )))

The problem is that some crops have a large sample size, with plenty of points for all the variables I'm regressing on, while others have a very small sample size in which no observations received a given treatment (i.e., no blood fruit crops were intercropped, etc.).
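One way to check which crop and predictor combinations would trigger the error is to count the distinct observed values of each predictor within each crop. This is only a sketch: the data frame `crops`, its values, and the two-predictor subset are assumptions made for illustration, not the asker's data.

```r
# Hypothetical data: two crops, one of which was never intercropped
crops <- data.frame(
  crop_name = c("maize", "maize", "maize", "maize",
                "blood fruit", "blood fruit", "blood fruit"),
  value     = c(1.0, 2.1, 2.9, 4.2, 0.8, 1.1, 0.9),
  intercrop = c("yes", "no", "yes", "no", "no", "no", "no"),
  fert      = c("yes", "no", "no", "yes", "yes", "no", "yes")
)

predictors <- c("intercrop", "fert")

# For each crop, count the distinct observed values of each predictor.
# The result is a matrix with predictors in rows and crops in columns.
n_distinct_by_crop <- sapply(
  split(crops[predictors], crops$crop_name),
  function(d) sapply(d, function(x) length(unique(x)))
)
print(n_distinct_by_crop)
```

Any cell with a count below 2 marks a predictor that cannot be used as a factor in that crop's regression.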

Is there a way, within the for loop, to tell R to regress on what it can, drop everything else, and avoid this error message?

I am quite new to this, so this may not be the best way. You may need to set up a for loop over crop_name; in my example, df is your subset for one crop group.

# stringsAsFactors = TRUE is needed in R >= 4.0.0 so that levels() works on these columns
df <- data.frame(intercrop = c("A","B","C","A","B","C"),
                 erosion_c = c("A","D","C","A","B","C"),
                 purchased_seed = c("A","B","D","F","E","C"),
                 inorg_pest = c("A","B","C","A","B","C"),
                 org_pest = c("A","B","A","A","B","B"),
                 landscape = c("A","A","A","A","A","A"),
                 fert = c("A","B","C","A","B","C"),
                 inorgfert = c("A","B","C","A","B","C"),
                 stringsAsFactors = TRUE)


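# Collect the levels of each column (lapply always returns a list,
# whereas sapply may return a matrix when all columns have the same number of levels)
yo <- lapply(df, levels)

# Store the number of levels per column in a one-row data frame
hi <- as.data.frame(c(NA))
for (i in seq_along(yo)) {
  hi[i] <- length(yo[[i]])
  names(hi)[i] <- names(df[i])
}

# Keep only the columns with at least two levels
hi <- subset(as.data.frame(t(hi)), V1 >= 2)

# Build the formula from the remaining column names
formu <- row.names(hi)
formu <- as.formula(paste("value ~", paste(formu, collapse = " + ")))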
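The loop over crop_name mentioned above could look roughly like the following base-R sketch. The helper `fit_one_crop`, the data frame `crops`, and its values are hypothetical names and data introduced for illustration. The helper counts distinct observed values rather than calling levels(), because a factor column in a subset keeps levels that no longer occur in it.

```r
# Hypothetical helper: fit response ~ predictors on one crop's rows, keeping only
# predictors that show at least two distinct non-missing values in those rows
fit_one_crop <- function(d, predictors, response = "value") {
  n_distinct <- sapply(d[predictors], function(x) length(unique(x[!is.na(x)])))
  keep <- predictors[n_distinct >= 2]
  if (length(keep) == 0) {
    return(NULL)  # nothing left to regress on for this crop
  }
  lm(reformulate(keep, response = response), data = d)
}

# Hypothetical data: 'blood fruit' was never intercropped
crops <- data.frame(
  crop_name = c("maize", "maize", "maize", "maize",
                "blood fruit", "blood fruit", "blood fruit"),
  value     = c(1.0, 2.1, 2.9, 4.2, 0.8, 1.1, 0.9),
  intercrop = c("yes", "no", "yes", "no", "no", "no", "no"),
  fert      = c("yes", "no", "no", "yes", "yes", "no", "yes")
)

# One model per crop, each using only the predictors that vary within that crop
fits <- list()
for (crop in unique(crops$crop_name)) {
  fits[[crop]] <- fit_one_crop(crops[crops$crop_name == crop, ],
                               predictors = c("intercrop", "fert"))
}

# Inspect which terms each model ended up with
lapply(fits, formula)
```

One limitation of this sketch: lm() drops rows with missing values in any model variable, so a predictor that passes the check could still end up with a single level after those rows are removed. Crops for which no predictor varies are skipped, since assigning NULL does not add an element to the list.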
