Convert a factor into multiple columns in R

Question

I have a dataset containing patient IDs and their disease test results, as follows:

Id  Result
1   Strep A: Positive
2   Flu A: Negative, Flu B: Negative
3   Rsv: Positive, RsvA: Negative, RsvB: Positive
4   Strep A: Negative
5   Flu A: Negative, Flu B: Negative
6   Flu A: Negative, Flu B: Negative
7   Strep A: Positive

How can I split the Result column so that it looks like this:

Id  Result_Strep A  Result_Flu A    Result_Flu B    Result_Rsv  Result_RsvA Result_RsvB
1   Positive        NA              NA              NA          NA          NA
2   NA              Negative        Negative        NA          NA          NA
3   NA              NA              NA              Positive    Negative    Positive
4   Negative        NA              NA              NA          NA          NA
5   NA              Negative        Negative        NA          NA          NA
6   NA              Negative        Negative        NA          NA          NA
7   Positive        NA              NA              NA          NA          NA

Data (dput output)

structure(list(Id = 1:7, Result = c("Strep A: Positive", "Flu A: Negative, Flu B: Negative", 
"Rsv: Positive, RsvA: Negative, RsvB: Positive", "Strep A: Negative", 
"Flu A: Negative, Flu B: Negative", "Flu A: Negative, Flu B: Negative", 
"Strep A: Positive")), row.names = c(NA, -7L), class = "data.frame")

Answer 1

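We can use separate_rows to split Result on ", " so that each test gets its own row, then use separate to split that column in two (test name and outcome), and finally reshape to "wide" format with pivot_wider.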

library(dplyr)
library(tidyr)
library(stringr)
df1 %>%
   separate_rows(Result, sep=", ") %>% 
   separate(Result, into = c("Result1", "Result2"), sep=":\\s*") %>% 
   mutate(Result1 = str_c("Result_", Result1)) %>%
   # in case of duplicate elements uncomment the commented code below
   #group_by(Result1) %>%       
   #mutate(rn = row_number()) %>% 
   #ungroup %>%
   pivot_wider(names_from = Result1, values_from = Result2)# %>%
   #select(-rn)
# A tibble: 7 x 7
#     Id `Result_Strep A` `Result_Flu A` `Result_Flu B` Result_Rsv Result_RsvA Result_RsvB
#  <int> <chr>            <chr>          <chr>          <chr>      <chr>       <chr>      
#1     1 Positive         <NA>           <NA>           <NA>       <NA>        <NA>       
#2     2 <NA>             Negative       Negative       <NA>       <NA>        <NA>       
#3     3 <NA>             <NA>           <NA>           Positive   Negative    Positive   
#4     4 Negative         <NA>           <NA>           <NA>       <NA>        <NA>       
#5     5 <NA>             Negative       Negative       <NA>       <NA>        <NA>       
#6     6 <NA>             Negative       Negative       <NA>       <NA>        <NA>       
#7     7 Positive         <NA>           <NA>           <NA>       <NA>        <NA>
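
The commented-out lines in the code above deal with duplicate elements. As a minimal sketch of one way that can look, the example below uses a small made-up data frame (dup, not part of the question) in which Id 1 has two Strep A results. It numbers repeats within each Id and test, which is an assumption on my part: the commented code above groups by Result1 only. The column names Test and Value are also chosen just for this illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: Id 1 has the same test reported twice
dup <- data.frame(
  Id = c(1L, 1L, 2L),
  Result = c("Strep A: Positive", "Strep A: Negative", "Flu A: Negative"),
  stringsAsFactors = FALSE
)

out <- dup %>%
  separate_rows(Result, sep = ",\\s*") %>%
  separate(Result, into = c("Test", "Value"), sep = ":\\s*") %>%
  # number repeated tests within each Id so every cell holds one value
  group_by(Id, Test) %>%
  mutate(rn = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = Test, values_from = Value,
              names_prefix = "Result_")
```

Each repeat ends up on its own row, identified by Id and rn. Drop rn with select(-rn) once it is no longer needed.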

Data

df1 <- structure(list(Id = 1:7, Result = c("Strep A: Positive", 
 "Flu A: Negative, Flu B: Negative", 
"Rsv: Positive, RsvA: Negative, RsvB: Positive", "Strep A: Negative", 
"Flu A: Negative, Flu B: Negative", "Flu A: Negative, Flu B: Negative", 
"Strep A: Positive")), class = "data.frame", row.names = c(NA, 
-7L))
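
For comparison, the same split-then-widen idea can be sketched in base R without any packages. This is not the answer's method, only an illustration. The names pieces, long and wide are made up here, and as.character() is an assumption to cover the case where Result is stored as a factor. If a test appears more than once for the same Id, this sketch keeps only the first value.

```r
df1 <- data.frame(
  Id = 1:7,
  Result = c("Strep A: Positive",
             "Flu A: Negative, Flu B: Negative",
             "Rsv: Positive, RsvA: Negative, RsvB: Positive",
             "Strep A: Negative",
             "Flu A: Negative, Flu B: Negative",
             "Flu A: Negative, Flu B: Negative",
             "Strep A: Positive"),
  stringsAsFactors = FALSE
)

# Split every Result string into its "test: outcome" pieces
pieces <- strsplit(as.character(df1$Result), ",\\s*")

# Long format: one row per Id and piece
long <- data.frame(
  Id = rep(df1$Id, lengths(pieces)),
  pair = unlist(pieces),
  stringsAsFactors = FALSE
)
long$test  <- sub(":.*$", "", long$pair)         # text before the colon
long$value <- sub("^[^:]*:\\s*", "", long$pair)  # text after the colon

# Wide format: one column per distinct test, NA where an Id lacks that test
wide <- data.frame(Id = df1$Id)
for (nm in unique(long$test)) {
  idx <- long$test == nm
  wide[[paste0("Result_", nm)]] <- long$value[idx][match(wide$Id, long$Id[idx])]
}
wide
```

The columns appear in the order the tests are first seen in the data, which matches the layout requested in the question.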
