
R - set reference level of factor to NA

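I have a data.table with some factor columns that contain NA values. I have deliberately included NA as a level of the factor (i.e., x <- factor(x, exclude=NULL), rather than the default behaviour of x <- factor(x, exclude=NA)) because the NAs are meaningful for my model. For these factor columns, I would like to relevel() so that NA is the reference level, but I'm struggling with the syntax.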

# silly reproducible example
library(data.table)
a <- data.table(animal = c("turkey","platypus","dolphin"),
            mass_kg = c(8, 2, 200),
            egg_size= c("large","small",NA),
            intelligent=c(0,0,1)
            )
lr <- glm(intelligent ~ mass_kg + egg_size, data=a, family = binomial)
summary(lr) 

# By default, egg_size is converted to a factor with no level for NA
# However, in this case NA is meaningful (since most mammals don't lay eggs)

a[,egg_size:=factor(egg_size, exclude=NULL) ] # exclude=NULL allows an NA level

lr <- glm(intelligent ~ mass_kg + egg_size, data=a, family = binomial)
summary(lr) # Now NA is included in the model, but not as the reference level

a[,levels(egg_size)] # Returns: [1] "large" "small" NA    

a[,egg_size:=relevel(egg_size,ref=NA)]
# Returns:
# Error in relevel.factor(egg_size, ref = NA) : 
#   'ref' must be an existing level

What is the correct syntax for relevel(), or do I need to use something else? Thanks very much.

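You have to specify the correct type of NA, which is NA_character_, but relevel() then throws out the NA level, which is probably a bug. A workaround is to specify the levels directly yourself: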

# throw out NA's to begin with
egg_size = factor(c("large","small",NA), exclude = NA)

# but then add them back at the beginning
factor(egg_size, c(NA, levels(egg_size)), exclude = NULL)
#[1] large small <NA> 
#Levels: <NA> large small
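The same idea can be wrapped in a small function so it works on a column that already has NA as a level, as in the question. This is only a sketch: relevel_na is a hypothetical helper name, not something provided by base R or data.table.

```r
# Hypothetical helper, not part of base R: make NA the first (reference) level.
relevel_na <- function(x) {
  lev <- levels(factor(x, exclude = NA))          # the non-NA levels
  factor(x, levels = c(NA, lev), exclude = NULL)  # NA first, kept as a level
}

# Same values as the egg_size column in the question
egg_size <- factor(c("large", "small", NA), exclude = NULL)
levels(egg_size)                  # "large" "small" NA

egg_size <- relevel_na(egg_size)
levels(egg_size)                  # NA "large" "small"
```

With the data.table from the question, this would presumably be applied as a[, egg_size := relevel_na(egg_size)].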

In case you're wondering, c converts the logical NA to the correct (character) type.
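A quick way to check the types involved is shown below. The variable name lev is only illustrative.

```r
typeof(NA)                        # "logical": a bare NA is not a character value
typeof(NA_character_)             # "character"
is.character(NA)                  # FALSE

# Combining NA with strings coerces it to a character NA
lev <- c(NA, "large", "small")
typeof(lev)                       # "character"
identical(lev[1], NA_character_)  # TRUE
```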
