How to do a GLM when "contrasts can be applied only to factors with 2 or more levels"?
I want to run a regression in R using glm, but I get the contrasts error. Is there a way to fit the model anyway?
mydf <- data.frame(Group=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12),
WL=rep(c(1,0),12),
New.Runner=c("N","N","N","N","N","N","Y","N","N","N","N","N","N","Y","N","N","N","Y","N","N","N","N","N","Y"),
Last.Run=c(1,5,2,6,5,4,NA,3,7,2,4,9,8,NA,3,5,1,NA,6,10,7,9,2,NA))
mod <- glm(formula = WL~New.Runner+Last.Run, family = binomial, data = mydf)
#Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
# contrasts can be applied only to factors with 2 or more levels
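Using the debug_contr_error and debug_contr_error2 functions defined in the question How to debug "contrasts can be applied only to factors with 2 or more levels" error?, we can easily locate the problem: the variable New.Runner has only a single level left. Every row where New.Runner is "Y" has Last.Run missing, so those rows are dropped before the model is fitted.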
info <- debug_contr_error2(WL ~ New.Runner + Last.Run, mydf)
info[c(2, 3)]
#$nlevels
#New.Runner
# 1
#
#$levels
#$levels$New.Runner
#[1] "N"
## the data frame that is actually used by `glm`
dat <- info$mf
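If the debug_contr_error2 helper is not at hand, roughly the same check can be sketched in base R. This is only an approximation of what that helper reports, assuming the default na.action (na.omit); the names mf, is_cat and n_levels are made up for this illustration.

```r
# Rebuild the example data from the question.
mydf <- data.frame(
  Group = rep(1:12, each = 2),
  WL = rep(c(1, 0), 12),
  New.Runner = c("N","N","N","N","N","N","Y","N","N","N","N","N",
                 "N","Y","N","N","N","Y","N","N","N","N","N","Y"),
  Last.Run = c(1,5,2,6,5,4,NA,3,7,2,4,9,8,NA,3,5,1,NA,6,10,7,9,2,NA)
)

# model.frame() drops incomplete rows (default na.action = na.omit),
# which is the same row filtering glm() applies before contrasts are set.
mf <- model.frame(WL ~ New.Runner + Last.Run, data = mydf)

# For each character/factor column, count the distinct values that survive.
is_cat <- vapply(mf, function(x) is.factor(x) || is.character(x), logical(1))
n_levels <- vapply(mf[is_cat], function(x) length(unique(x)), integer(1))

nrow(mf)   # rows actually used in the fit
n_levels   # any entry equal to 1 will trigger the contrasts error
```

Any variable reported with a count of 1 is a candidate for the error; here that is New.Runner.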
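Contrasts cannot be applied to a single-level factor, because any kind of contrasts reduces the number of levels by 1. Since 1 - 1 = 0, this variable would be dropped from the model matrix.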
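A quick way to see the "levels minus one" rule is to look at the dimensions of the contrast matrices themselves. The sketch below uses only base R contrast functions; the names k and cols are illustrative.

```r
# A contrast matrix for a factor with n levels has n rows and n - 1 columns.
dim(contr.treatment(3))
dim(contr.sum(4))

# The same holds for any n >= 2: the column count is always n - 1.
k <- 2:5
cols <- vapply(k, function(n) ncol(contr.treatment(n)), integer(1))
cols
```

With n = 1 the matrix would need zero columns, which is why the single-level case is rejected.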
So can we simply ask for no contrasts to be applied to a single-level factor? No. All contrasts methods forbid this:
contr.helmert(n = 1, contrasts = FALSE)
#Error in contr.helmert(n = 1, contrasts = FALSE) :
# not enough degrees of freedom to define contrasts
contr.poly(n = 1, contrasts = FALSE)
#Error in contr.poly(n = 1, contrasts = FALSE) :
# contrasts not defined for 0 degrees of freedom
contr.sum(n = 1, contrasts = FALSE)
#Error in contr.sum(n = 1, contrasts = FALSE) :
# not enough degrees of freedom to define contrasts
contr.treatment(n = 1, contrasts = FALSE)
#Error in contr.treatment(n = 1, contrasts = FALSE) :
# not enough degrees of freedom to define contrasts
contr.SAS(n = 1, contrasts = FALSE)
#Error in contr.treatment(n, base = if (is.numeric(n) && length(n) == 1L) n else length(n), :
# not enough degrees of freedom to define contrasts
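In fact, if you think about it carefully, you will conclude that without contrasts, a factor with a single level is just a dummy variable of all 1s, that is, the intercept. So you can definitely do the following: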
dat$New.Runner <- 1 ## set it to 1, as if no contrasts is applied
mod <- glm(formula = WL ~ New.Runner + Last.Run, family = binomial, data = dat)
#(Intercept) New.Runner Last.Run
# 1.4582 NA -0.2507
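Because of rank deficiency, you get an NA coefficient for New.Runner. In fact, applying contrasts is a fundamental way to avoid rank deficiency. It is just that when a factor has only one level, applying contrasts becomes a paradox.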
Let's also look at the model matrix:
model.matrix(mod)
# (Intercept) New.Runner Last.Run
#1 1 1 1
#2 1 1 5
#3 1 1 2
#4 1 1 6
#5 1 1 5
#6 1 1 4
#8 1 1 3
#9 1 1 7
#10 1 1 2
#11 1 1 4
#12 1 1 9
#13 1 1 8
#15 1 1 3
#16 1 1 5
#17 1 1 1
#19 1 1 6
#20 1 1 10
#21 1 1 7
#22 1 1 9
#23 1 1 2
(Intercept) and New.Runner have identical columns, and only one of them can be estimated. If you want to estimate New.Runner, drop the intercept:
glm(formula = WL ~ 0 + New.Runner + Last.Run, family = binomial, data = dat)
#New.Runner Last.Run
# 1.4582 -0.2507
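Make sure you thoroughly digest the rank-deficiency problem. If you have more than one single-level factor and you replace all of them by 1, dropping the single intercept still leaves the model rank-deficient.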
dat$foo.factor <- 1
glm(formula = WL ~ 0 + New.Runner + foo.factor + Last.Run, family = binomial, data = dat)
#New.Runner foo.factor Last.Run
# 1.4582 NA -0.2507