
Spreading a binary variable by a grouping variable in R

I have a dataset (DF) that looks like what I have below:

   ID DOB      Age Outcome    
   1  1/01/80  18     1
   1  1/01/80  18     0
   2  1/02/81  17     1
   2  1/02/81  17     0
   3  1/03/70  28     1

I want to change my dataset to wide format, so that I have one row per ID. However, given that DOB and Age are the same for each ID, I want each of these variables to be a single column in the new dataset, with multiple columns only for the Outcome variable, as below:

   ID DOB      Age Outcome.1 Outcome.2    
   1  1/01/80  18     1         0
   2  1/02/81  17     1         0
   3  1/03/70  28     1         NA

I have tried using tidyr and reshape, but I can't seem to get the dataset into this format. For example, when I use the code:

spread(DF, key=ID, value = Outcome)

I get an error that indicates that I have duplicate identifiers for rows. Is there a way to get the dataset into the format I would like?

Thanks.

One solution uses the tidyverse. The idea is to add a row-number column so that every row has a unique identifier; after that, there are several ways to apply spread. Note that this version names the new columns by outcome value (Outcome.0, Outcome.1) rather than by position within each ID.

df <- read.table(text = "ID DOB      Age Outcome    
1  1/01/80  18     1
1  1/01/80  18     0
2  1/02/81  17     1
2  1/02/81  17     0
3  1/03/70  28     1", header = TRUE, stringsAsFactors = FALSE)

library(tidyverse)

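# Number every row and turn Outcome into the labels "Outcome.0" / "Outcome.1"
df %>% mutate(rownum = row_number(), Outcome = paste("Outcome", Outcome, sep = ".")) %>%
  # One column per label; each cell holds a row number, or NA if that outcome is absent
  spread(Outcome, rownum) %>%
  # Replace the row numbers with the outcome value each column stands for
  mutate(Outcome.0 = ifelse(!is.na(Outcome.0), 0, NA)) %>%
  mutate(Outcome.1 = ifelse(!is.na(Outcome.1), 1, NA))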

# Result:
#  ID     DOB Age Outcome.0 Outcome.1
#1  1 1/01/80  18         0         1
#2  2 1/02/81  17         0         1
#3  3 1/03/70  28        NA         1

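The dcast function (from the reshape2 package) is meant for things like this. Note that spreading by Outcome gives one column per outcome value (named 0 and 1), not Outcome.1 and Outcome.2.

library(reshape2)
dcast(DF, ID + DOB + Age ~ Outcome, value.var = "Outcome")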
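
If the columns should be numbered by position within each ID (Outcome.1, Outcome.2) as in the question, one possible variant is to use dcast from data.table together with rowid(). This is only a sketch: it assumes the data.table package is installed, and the names DF and wide_dt are just illustrative.

```r
library(data.table)

# Example data from the question
DF <- read.table(text = "ID DOB Age Outcome
1 1/01/80 18 1
1 1/01/80 18 0
2 1/02/81 17 1
2 1/02/81 17 0
3 1/03/70 28 1", header = TRUE, stringsAsFactors = FALSE)

# rowid(ID) counts rows within each ID (1, 2, ...); the prefix turns
# those counts into the column names "Outcome.1", "Outcome.2", ...
wide_dt <- dcast(as.data.table(DF),
                 ID + DOB + Age ~ rowid(ID, prefix = "Outcome."),
                 value.var = "Outcome")

print(wide_dt)
```

IDs with fewer rows than the maximum (here ID 3) are filled with NA in the extra columns.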

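You could use tidyr and dplyr. Numbering the rows within each ID gives spread a unique key, which avoids the duplicate-identifiers error:

library(dplyr)
library(tidyr)

DF %>%
   group_by(ID) %>%
   mutate(OutcomeID = paste0('Outcome.', row_number())) %>%  # Outcome.1, Outcome.2, ... per ID
   spread(OutcomeID, Outcome)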
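
In tidyr 1.0.0 and later, spread is superseded by pivot_wider. The same approach could look roughly like the following sketch, which assumes a recent tidyr and uses the illustrative names DF and wide:

```r
library(dplyr)
library(tidyr)

# Example data from the question
DF <- read.table(text = "ID DOB Age Outcome
1 1/01/80 18 1
1 1/01/80 18 0
2 1/02/81 17 1
2 1/02/81 17 0
3 1/03/70 28 1", header = TRUE, stringsAsFactors = FALSE)

wide <- DF %>%
  group_by(ID) %>%
  mutate(OutcomeID = row_number()) %>%   # 1, 2, ... within each ID
  ungroup() %>%
  pivot_wider(names_from = OutcomeID,    # new column names come from the counter
              values_from = Outcome,     # cell values come from Outcome
              names_prefix = "Outcome.") # gives Outcome.1, Outcome.2, ...

print(wide)
```

ID, DOB and Age are left out of names_from and values_from, so pivot_wider treats them as the identifying columns and keeps one row per ID. IDs with fewer outcomes get NA in the remaining columns.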
