How to recode a character Outcome variable into a dichotomous Outcome (0, 1) in R
I have a dataset of cancer patients with different outcomes:
TypeofOutcome DateStageIV
NA 01.04.2014
Died from melanoma 01.06.2011
Died from melanoma 01.11.2013
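I want a new column called "Outcome" that codes all patients who are still alive as 1 and all patients who died as 0. Based on a previous exercise, I wrote this code: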
mergedData$Outcome <- 1* (mergedData$TypeofOutcome = c ("Alive with stable disease", "Alive with progressive disease", "Alive with complete response"))
I already suspected that this would not work, and I got this error message:
Error in 1 * (mergedData$TypeofOutcome = c("Alive with stable disease", :
  non-numeric argument to binary operator
I am sure there is a simple solution to my problem.
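If I understand correctly, you want to create a dichotomous variable based on the values of a string variable: if TypeOfOutcome matches any of "Alive with stable disease", "Alive with progressive disease", or "Alive with complete response", Outcome is 1, otherwise it is 0. I assume your dataset looks similar to this: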
mergedData <- data.frame(
TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response", NA, "Died from melanoma"),
DateStageIV = sample(seq(as.Date('2011/01/01'), as.Date('2015/01/01'), by="day"), 5))
# TypeOfOutcome DateStageIV
# 1 Alive with stable disease 2013-05-09
# 2 Alive with progressive disease 2014-08-08
# 3 Alive with complete response 2013-02-10
# 4 <NA> 2014-05-23
# 5 Died from melanoma 2012-08-08
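Before recoding, it may help to see why the attempt in the question fails. Inside parentheses, `=` assigns rather than compares, so the expression ends up handing a character vector to `*`, which raises the "non-numeric argument to binary operator" error. Swapping in `==` would not be enough either, because `==` compares element by element instead of testing membership. A minimal sketch on a toy vector (the names `x` and `alive` are made up for illustration and do not come from the question):

```r
x <- c("Alive with stable disease", "Died from melanoma", NA)
alive <- c("Alive with stable disease", "Alive with progressive disease",
           "Alive with complete response")

# Multiplying a character vector by a number is what triggers the error
# in the question; tryCatch() captures the message instead of stopping.
res <- tryCatch(1 * alive, error = function(e) conditionMessage(e))
print(res)

# `==` compares position by position (x[1] with alive[1], x[2] with alive[2], ...),
# so it does not ask "is this value one of the alive categories?".
eq <- x == alive
print(eq)      # TRUE FALSE NA

# `%in%` tests membership and returns FALSE (not NA) for missing values.
member <- x %in% alive
print(member)  # TRUE FALSE FALSE
```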
The ifelse function is well suited to recoding. Its basic syntax is:
ifelse(test, yes, no)
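If test is true, it returns yes; otherwise it returns no.
In this case, test is the condition that the patient is still alive, which is indicated by the string in TypeOfOutcome being "Alive with stable disease", "Alive with progressive disease", or "Alive with complete response". One such test is: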
test <- mergedData$TypeOfOutcome %in% c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response")
If a value in TypeOfOutcome matches any of the values after the %in% operator, test is TRUE. Here yes is 1 and no is 0. Create the new variable:
mergedData$Outcome <- ifelse(test, 1, 0)
mergedData
# TypeOfOutcome DateStageIV Outcome
# 1 Alive with stable disease 2013-05-09 1
# 2 Alive with progressive disease 2014-08-08 1
# 3 Alive with complete response 2013-02-10 1
# 4 <NA> 2014-05-23 0
# 5 Died from melanoma 2012-08-08 0
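Note that in the output above the patient with a missing TypeOfOutcome (row 4) is coded 0, because %in% returns FALSE for NA. If a missing outcome should stay missing rather than count as a death, one possible variant is sketched below. It assumes the same category labels as above and rebuilds a small data frame so that it runs on its own. Whether NA should be kept is a judgment call about the data, not something the question states.

```r
# Small stand-in for the dataset (only the column needed here).
mergedData <- data.frame(
  TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease",
                    "Alive with complete response", NA, "Died from melanoma"),
  stringsAsFactors = FALSE
)
alive <- c("Alive with stable disease", "Alive with progressive disease",
           "Alive with complete response")

# Membership test: TRUE for the alive categories, FALSE otherwise (including NA).
test <- mergedData$TypeOfOutcome %in% alive

# 1 = alive, 0 = not alive.
mergedData$Outcome <- ifelse(test, 1, 0)

# Put NA back wherever the outcome was not recorded.
mergedData$Outcome[is.na(mergedData$TypeOfOutcome)] <- NA

print(mergedData$Outcome)  # 1 1 1 NA 0
```

Keeping NA makes later summaries such as mean(mergedData$Outcome, na.rm = TRUE) reflect only patients with a recorded outcome.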