How to recode a character outcome variable into a dichotomous outcome (0, 1) in R

I have a data set with cancer patients and different outcomes:

TypeofOutcome        DateStageIV

NA                   01.04.2014
Died from melanoma   01.06.2011
Died from melanoma   01.11.2013

I want a new column called "Outcome" with all patients still alive coded as 1 and all dead coded as 0. Based on a previous exercise, I wrote this code:

mergedData$Outcome <- 1* (mergedData$TypeofOutcome = c ("Alive with stable disease", "Alive with progressive disease", "Alive with complete response"))

I already assumed that this would not work, and I got this error message:

Error in 1 * (mergedData$TypeofOutcome = c("Alive with stable disease", :
  non-numeric argument to binary operator

I am sure that there is a simple solution to my problem.

If I understand you correctly, you want to create a dichotomous variable that depends on the value of a string variable. For example: if TypeOfOutcome matches any of "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response", Outcome would be 1, otherwise 0. I assume your dataset looks similar to this:

mergedData <- data.frame(
  TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response", NA, "Died from melanoma"), 
  DateStageIV = sample(seq(as.Date('2011/01/01'), as.Date('2015/01/01'), by="day"), 5))


#                    TypeOfOutcome DateStageIV
# 1      Alive with stable disease  2013-05-09
# 2 Alive with progressive disease  2014-08-08
# 3   Alive with complete response  2013-02-10
# 4                           <NA>  2014-05-23
# 5             Died from melanoma  2012-08-08

The function ifelse is suitable for this form of recoding. The basic syntax is:

ifelse(test, yes, no)
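As a toy illustration of that signature (the vector below is invented for the example and is not part of the patient data), ifelse works element by element, and a missing value in test comes back as NA rather than as yes or no:

```r
# Hypothetical logical vector, used only to show how ifelse() behaves
toy_test <- c(TRUE, FALSE, NA)

# Each TRUE becomes 1, each FALSE becomes 0, and NA stays NA
toy_result <- ifelse(toy_test, 1, 0)
print(toy_result)
# [1]  1  0 NA
```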

If the statement in test is true, ifelse returns the value of yes; otherwise it returns the value of no. In this case test should be true for every patient who is still alive, which is indicated by the string in TypeOfOutcome being "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response". A test for this would be:

test <- mergedData$TypeOfOutcome %in% c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response")
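The choice of operator matters here. The attempt in the question used a single =, which R treats as assignment rather than comparison, so the parenthesised expression yields a character vector that cannot be multiplied by 1. Switching to == would not be enough either, because == compares element by element (recycling the shorter vector) instead of checking membership. A small sketch of the difference, using a made-up vector x and a helper name alive_levels that are assumptions for illustration only:

```r
# Hypothetical values, chosen so that positions do not line up with alive_levels
x <- c("Alive with progressive disease", "Alive with stable disease", NA)
alive_levels <- c("Alive with stable disease",
                  "Alive with progressive disease",
                  "Alive with complete response")

# == compares position by position, so both "Alive" rows are missed
eq_result <- x == alive_levels
print(eq_result)
# [1] FALSE FALSE    NA

# %in% asks whether each element of x occurs anywhere in alive_levels
in_result <- x %in% alive_levels
print(in_result)
# [1]  TRUE  TRUE FALSE
```

Note that %in% returns FALSE for NA rather than NA, so the resulting logical vector never contains missing values.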

test is TRUE if the value in TypeOfOutcome matches any of the values after the %in% operator. yes would then be 1 and no would be 0. To create the new variable:

mergedData$Outcome <- ifelse(test, 1, 0)

mergedData

#                    TypeOfOutcome DateStageIV Outcome
# 1      Alive with stable disease  2013-05-09       1
# 2 Alive with progressive disease  2014-08-08       1
# 3   Alive with complete response  2013-02-10       1
# 4                           <NA>  2014-05-23       0
# 5             Died from melanoma  2012-08-08       0
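Row 4 shows that a missing TypeOfOutcome is coded as 0, the same as a patient who died. If missing outcomes should instead stay missing, one possible variation is sketched below. It assumes the same column names as the sample data above; alive_levels is a helper name introduced here for illustration, and whether NA should mean "unknown" is a judgment about the data, not something the question states.

```r
# Same structure as the sample data above, without the random dates
mergedData <- data.frame(
  TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease",
                    "Alive with complete response", NA, "Died from melanoma"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the outcome strings that count as "alive"
alive_levels <- c("Alive with stable disease",
                  "Alive with progressive disease",
                  "Alive with complete response")

# TRUE/FALSE from %in% converts to 1/0
mergedData$Outcome <- as.integer(mergedData$TypeOfOutcome %in% alive_levels)

# Restore NA where the original outcome was missing
mergedData$Outcome[is.na(mergedData$TypeOfOutcome)] <- NA

print(mergedData$Outcome)
# [1]  1  1  1 NA  0
```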
