I am trying to loop a multiple linear regression and automatically drop factors which don't have at least two levels to avoid the following error message:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
Right now my code is:
df %>%
group_by(crop_name) %>%
do(tidy(lm(formula = value ~ intercrop +
erosion_c + purchased_seed + inorg_pest +
org_pest + landscape + fert + inorgfert,
data = . )))
The problem is that some crops have a large sample size with plenty of points for every variable I'm regressing on, while others have a very small sample size in which no observations received a given treatment (e.g., no blood fruit crops were intercropped).
Is there a way, within the grouped loop, to tell R to regress on whatever it can, drop everything else, and avoid this error message?
I am quite new to this, so it may not be the best way. You may need to set up a for loop over crop_name; in my example, df stands for the subset of the data for a single crop group.
df <- data.frame(intercrop = c("A","B","C","A","B","C"),
erosion_c = c("A","D","C","A","B","C"),
purchased_seed = c("A","B","D","F","E","C"),
inorg_pest = c("A","B","C","A","B","C"),
org_pest = c("A","B","A","A","B","B"),
landscape = c("A","A","A","A","A","A"),
fert = c("A","B","C","A","B","C"),
inorgfert = c("A","B","C","A","B","C"),
stringsAsFactors = TRUE  # needed in R >= 4.0 so these columns are factors and levels() works
)
# Count the distinct non-missing values in each predictor column
n_levels <- sapply(df, function(x) length(unique(na.omit(x))))
# Keep only the predictors that have at least two levels
keep <- names(n_levels)[n_levels >= 2]
# Build the model formula from the surviving predictors (here, landscape is dropped)
formu <- reformulate(keep, response = "value")
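To apply the same idea to every crop rather than a single subset, something along the following lines might work. This is only a sketch in base R under assumptions of my own: the toy data frame `crops`, the vector `predictors` and the helper `fit_one_crop` are hypothetical names that do not appear in the question, and only two of the eight predictors are included to keep it short.

```r
# Hypothetical toy data: two crops, one of which has a constant predictor
set.seed(1)
crops <- data.frame(
  crop_name = rep(c("maize", "blood fruit"), each = 6),
  value     = rnorm(12),
  intercrop = c("yes", "no", "yes", "no", "yes", "no",  # varies for maize
                "no", "no", "no", "no", "no", "no"),    # constant for blood fruit
  fert      = rep(c("yes", "no", "no"), times = 4),
  stringsAsFactors = TRUE
)

# Candidate predictors (assumed; the real list would hold all eight variables)
predictors <- c("intercrop", "fert")

# Fit one crop, keeping only predictors with at least two observed levels
fit_one_crop <- function(d) {
  n_levels <- sapply(d[predictors], function(x) length(unique(na.omit(x))))
  keep <- predictors[n_levels >= 2]
  if (length(keep) == 0) {
    # Nothing usable is left, so fall back to an intercept-only model
    return(lm(value ~ 1, data = d))
  }
  lm(reformulate(keep, response = "value"), data = d)
}

# One model per crop; each uses only the predictors that vary within that crop
fits <- lapply(split(crops, crops$crop_name), fit_one_crop)
lapply(fits, coef)
```

Each element of `fits` is an ordinary `lm` object, so `tidy()` from broom could be applied to each one if the tidy output from the question is wanted. The intercept-only fallback is a design choice on my part; returning `NULL` for such crops would be an equally reasonable alternative.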