
Error in Logistic Regression for Factors in R

I am trying to run a logistic regression with the following code:

model <- glm (Participation ~ Gender + Race + Ethnicity + Education + Comorbidities + WLProgram + LoseWeight + EverLoseWeight + PastYearLW + Age + BMI, data = LogisticData, family = binomial)

summary(model)

I keep getting the error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :  contrasts can be applied only to factors with 2 or more levels

After searching the forums, I checked which variables are factors:

str(LogisticData)
'data.frame':   994 obs. of  13 variables:
 $ outcome       : Factor w/ 2 levels "No","Yes": 1 1 2 2 1 2 2 1 2 2 ...
 $ Gender        : Factor w/ 3 levels "Male","Female",..: 1 2 2 1 2 1 1 1 1 
 $ Race          : Factor w/ 3 levels "White","Black",..: 1 1 1 3 1 1 1 1 1 1
 $ Ethnicity     : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 2
 $ Education     : Factor w/ 2 levels "Below Bachelors",..: 1 1 1 2 1 1 1 2 1
 $ Comorbidities : Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 2 2 1 1 ...
 $ WLProgram     : Factor w/ 2 levels "No","Yes": NA 1 2 2 1 1 1 NA 1 1 ...
 $ LoseWeight    : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
 $ PastYearLW    : Factor w/ 2 levels "Yes","No": NA 2 1 1 1 2 1 NA 1 1 ...
 $ EverLoseWeight: Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
 $ Age           : int  29 35 69 32 21 45 40 62 59 58 ...
 $ Participation : Factor w/ 2 levels "Yes","No": 2 2 1 1 1 1 1 2 1 2 ...
 $ BMI           : num  25.7 33.8 26.4 32.3 27.5 ...

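All of the factors appear to have 2 or more levels.

I also tried omitting the NAs, but I still get this error.

I want all of the variables in the regression, but I can't figure out why it won't run.

When I run: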

newdata <- droplevels(na.omit(LogisticData))
str(newdata)
'data.frame':   840 obs. of  13 variables:
 $ outcome       : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 2 ...
 $ Gender        : Factor w/ 3 levels "Male","Female",..: 2 2 1 2 1 1 1 2 1 
 $ Race          : Factor w/ 3 levels "White","Black",..: 1 1 3 1 1 1 1 1 3 
 $ Ethnicity     : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 
 $ Education     : Factor w/ 2 levels "Below Bachelors",..: 1 1 2 1 1 1 1 1 
 $ Comorbidities : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 1 1 2 ...
 $ WLProgram     : Factor w/ 2 levels "No","Yes": 1 2 2 1 1 1 1 1 1 1 ...
 $ LoseWeight    : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ PastYearLW    : Factor w/ 2 levels "Yes","No": 2 1 1 1 2 1 1 1 1 2 ...
 $ EverLoseWeight: Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ Age           : int  35 69 32 21 45 40 59 58 23 32 ...
 $ Participation : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 2 1 ...
 $ BMI           : num  33.8 26.4 32.3 27.5 45.4 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:154] 1 8 13 14 21 24 25 46 55 58 ...
 .. ..- attr(*, "names")= chr [1:154] "1" "8" "13" "14" ...

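This doesn't make sense to me: in the first str(LogisticData) output, EverLoseWeight clearly has 2 levels, since you can see both Yes and No as well as the codes 1 and 2. How do I fix this error?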

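Try running summary on the original data and make sure every level actually has values. I would post this as a comment, but I don't have the reputation points :(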
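A minimal sketch of that check, using a small made-up data frame (`toy` and its values are hypothetical, not the asker's data). With R's default `na.action` setting, glm() drops incomplete rows before fitting, so what matters is how many levels each factor still has among the complete cases:

```r
# Hypothetical toy data: whenever LoseWeight is "No", WLProgram is NA
toy <- data.frame(
  LoseWeight = factor(c("No", "Yes", "Yes", "Yes", "No", "Yes"),
                      levels = c("Yes", "No")),
  WLProgram  = factor(c(NA, "No", "Yes", "Yes", NA, "No"),
                      levels = c("No", "Yes")),
  Age        = c(29, 35, 69, 32, 21, 45)
)

# Per-level counts (including NA) on the original data
summary(toy)

# Keep only the complete rows, then count the levels actually observed
complete <- toy[complete.cases(toy), ]
is_fac <- vapply(complete, is.factor, logical(1))
observed_levels <- vapply(complete[is_fac],
                          function(f) nlevels(droplevels(f)),
                          integer(1))
observed_levels
```

Any factor that reports a single observed level here is a candidate for the contrasts error.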

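Based on your update, it looks like you have at least two options.

1: Remove the factors that are left with only one level once the NAs are dropped (i.e. LoseWeight and EverLoseWeight).

2: Treat NA as an extra level. Something along these lines: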

a = as.factor(c(1,1,NA,2))
b = as.factor(c(1,1,2,1))

# 0 is an unused factor level for a
x = data.frame(a, b)
levels(x$a) = c(levels(x$a), 0)
x$a[is.na(x$a)] = 0
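As an alternative sketch, base R's addNA() makes NA an explicit level without inventing a placeholder value such as 0. The vector below is the same toy input as above; `a_na` is just an illustrative name.

```r
a <- as.factor(c(1, 1, NA, 2))

# addNA() appends NA to the level set, so missing values become their own category
a_na <- addNA(a)

levels(a_na)   # inspect the levels, which now include NA
table(a_na)    # the missing observation is counted under the NA level
nlevels(a_na)
```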

However, this may not fix whatever singularity problem in the data produced the single-level factors in the first place.
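Option 1 can be sketched roughly as follows. The data frame here is a hypothetical stand-in for the result of droplevels(na.omit(LogisticData)), with only a few of the columns and invented values; `single_level` and `reduced` are illustrative names.

```r
# Hypothetical stand-in for droplevels(na.omit(LogisticData))
newdata <- data.frame(
  Participation = factor(c("Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No")),
  LoseWeight    = factor(rep("Yes", 8)),
  Gender        = factor(c("Male", "Female", "Female", "Male",
                           "Male", "Male", "Female", "Female")),
  Age           = c(35, 69, 32, 21, 45, 40, 59, 58)
)

# Flag factor columns that have fewer than two levels
single_level <- vapply(newdata,
                       function(col) is.factor(col) && nlevels(col) < 2,
                       logical(1))
names(newdata)[single_level]

# Drop the flagged columns and fit on the remaining predictors
reduced <- newdata[, !single_level, drop = FALSE]
model <- glm(Participation ~ ., data = reduced, family = binomial)
summary(model)
```

Dropping a constant column loses no information for the fit, since a predictor with a single value cannot explain any variation in the outcome.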

