Combine multiple rows for one ID into one row with multiple columns, based on 2 different variables, in R

Question

I am working with a dataframe in R that looks like this:

id <- c(1,1,1,2,2,3,3,3,3)
dx_code <- c("MI","HF","PNA","PNA","Cellulitis","MI","Flu","Sepsis","HF")
dx_date <- c("7/11/22","7/11/22","8/1/22","8/4/22","8/7/22","8/4/22","7/11/22","7/11/22","9/10/22")
df <- data.frame(id, dx_code, dx_date)
df

I want to group it so that each patient ID has one row listing each date they were seen and each diagnosis they received on that date. So it would look something like:

id2 <- c(1,2,3)
dx_date1 <- c("7/11/22","8/4/22","8/4/22")
dx_date1code1 <- c("MI","PNA","MI")
dx_date1code2 <- c("HF",NA,NA)
dx_date2 <- c("8/1/22","8/7/22","7/11/22")
dx_date2code1 <- c("PNA","Cellulitis","Flu")
dx_date2code2 <- c(NA,NA,"Sepsis")
dx_date3 <- c(NA,NA,"9/10/22")
dx_date3code1 <- c(NA,NA,"HF")
df2 <- data.frame(id2, dx_date1, dx_date1code1,dx_date1code2,dx_date2,dx_date2code1,dx_date2code2,dx_date3,dx_date3code1)
df2

I am not sure how to reformat it in this way. Is there a function in R for this, or should I try to use for loops? I would appreciate any help, thanks so much!

Answer 1

We can use a pivot_longer followed by a pivot_wider after some modifications.

library(dplyr)
library(tidyr)

df3 <- df %>%
  group_by(id) %>%
  # number each diagnosis row within its patient
  mutate(n = row_number()) %>%
  ungroup() %>%
  # stack the code and date columns into name/value pairs
  pivot_longer(cols = c(dx_code, dx_date)) %>%
  # append the row number so every name is unique within an id
  mutate(name = paste(name, n)) %>%
  select(-n)

df3

     id name      value  
  <dbl> <chr>     <chr>  
1     1 dx_code 1 MI     
2     1 dx_date 1 7/11/22
3     1 dx_code 2 HF     
4     1 dx_date 2 7/11/22

We now have the data in a format suitable for pivot_wider.

df3 %>% pivot_wider(names_from = name, values_from = value)

     id `dx_code 1` `dx_date 1` `dx_code 2` `dx_date 2` `dx_code 3` `dx_date 3` `dx_code 4` `dx_date 4`
  <dbl> <chr>       <chr>       <chr>       <chr>       <chr>       <chr>       <chr>       <chr>      
1     1 MI          7/11/22     HF          7/11/22     PNA         8/1/22      NA          NA         
2     2 PNA         8/4/22      Cellulitis  8/7/22      NA          NA          NA          NA         
3     3 MI          8/4/22      Flu         7/11/22     Sepsis      7/11/22     HF          9/10/22
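The per-row numbering above gives every diagnosis its own date column. If the layout from the question is wanted instead (one column per distinct date, followed by that date's codes), one possible sketch is to number the dates within each id and the codes within each date. The helper columns date_num and code_num and the intermediate names indexed, date_rows and code_rows are illustrative assumptions, and dates are numbered by first appearance because they are stored as strings, not chronologically.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  dx_code = c("MI", "HF", "PNA", "PNA", "Cellulitis", "MI", "Flu", "Sepsis", "HF"),
  dx_date = c("7/11/22", "7/11/22", "8/1/22", "8/4/22", "8/7/22",
              "8/4/22", "7/11/22", "7/11/22", "9/10/22")
)

indexed <- df %>%
  group_by(id) %>%
  # date_num (assumed name): position of each date among the id's distinct dates
  mutate(date_num = match(dx_date, unique(dx_date))) %>%
  group_by(id, date_num) %>%
  # code_num (assumed name): position of each code within its date
  mutate(code_num = row_number()) %>%
  ungroup()

# one row per id/date, destined for the dx_dateN columns
date_rows <- indexed %>%
  distinct(id, date_num, dx_date) %>%
  transmute(id, date_num, code_num = 0L,
            name = paste0("dx_date", date_num), value = dx_date)

# one row per diagnosis, destined for the dx_dateNcodeM columns
code_rows <- indexed %>%
  transmute(id, date_num, code_num,
            name = paste0("dx_date", date_num, "code", code_num), value = dx_code)

wide <- bind_rows(date_rows, code_rows) %>%
  # sort so each date column is followed by its code columns
  arrange(date_num, code_num) %>%
  select(id, name, value) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  arrange(id)

wide
```

Sorting on the numeric helper columns rather than on the generated names keeps the column order correct even when a patient has ten or more dates.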

Answer 2

I believe you can use pivot_wider for this. The output is not the same as in the original post, but it is similar to what you provided in your comment.

You can enumerate dates and codes after grouping by id using row_number().
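As a small illustration of that enumeration step, the sketch below runs it on a cut-down copy of the data (two patients only; the column name code_num is just a label chosen here). The counter restarts at 1 for each id.

```r
library(dplyr)

df_small <- data.frame(
  id = c(1, 1, 1, 2, 2),
  dx_code = c("MI", "HF", "PNA", "PNA", "Cellulitis")
)

numbered <- df_small %>%
  group_by(id) %>%
  # row_number() counts rows within each id group
  mutate(code_num = row_number()) %>%
  ungroup()

numbered$code_num
# 1 2 3 1 2
```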

After using pivot_wider, you can reorder the columns by the number embedded in each column name, so that each date column sits next to its matching code column.

library(tidyverse)

df %>%
  group_by(id) %>%
  mutate(code_num = row_number()) %>%
  pivot_wider(id_cols = id, 
              names_from = code_num, 
              values_from = c(dx_date, dx_code)) %>%
  select(id, names(.)[-1][order(readr::parse_number(names(.)[-1]))])

Output

     id dx_date_1 dx_code_1 dx_date_2 dx_code_2  dx_date_3 dx_code_3 dx_date_4 dx_code_4
  <dbl> <chr>     <chr>     <chr>     <chr>      <chr>     <chr>     <chr>     <chr>    
1     1 7/11/22   MI        7/11/22   HF         8/1/22    PNA       NA        NA       
2     2 8/4/22    PNA       8/7/22    Cellulitis NA        NA        NA        NA       
3     3 8/4/22    MI        7/11/22   Flu        7/11/22   Sepsis    9/10/22   HF
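As a possible alternative to reordering with select, recent versions of tidyr (1.2.0 or later, which is an assumption about the installed version) accept a names_vary argument in pivot_wider that interleaves the value columns directly. A minimal sketch on the question's data:

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  dx_code = c("MI", "HF", "PNA", "PNA", "Cellulitis", "MI", "Flu", "Sepsis", "HF"),
  dx_date = c("7/11/22", "7/11/22", "8/1/22", "8/4/22", "8/7/22",
              "8/4/22", "7/11/22", "7/11/22", "9/10/22")
)

wide <- df %>%
  group_by(id) %>%
  mutate(code_num = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = id,
              names_from = code_num,
              values_from = c(dx_date, dx_code),
              # "slowest" keeps dx_date_1 next to dx_code_1, and so on
              names_vary = "slowest")

names(wide)
```

This avoids parsing numbers back out of the column names, at the cost of requiring a newer tidyr.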

Answer 1
0 2022-08-13 21:05:19

Answer 2
0 2022-08-15 01:02:31
