How best to reshape a data frame from long to wide in R and combine values

Question

I have a data frame of about 2000 rows and 3 columns. In essence, I want to reshape this data frame from long to wide. This is an example of my current data:

ID   Procedure    Date
D55  Sedation     01/01/2001
D55  Excision     01/01/2001
D55  Biopsy       01/01/2001
A66  Sedation     02/02/2001
A66  Excision     02/02/2001
T44  Sedation     03/03/2001
T44  Biopsy       03/03/2001
T44  Sedation     04/04/2001
T44  Excision     04/04/2001
G88  Sedation     05/05/2001
G88  Biopsy       05/05/2001
G88  Sedation     06/06/2001
G88  Excision     06/06/2001
G88  Sedation     07/07/2001
G88  Re-excision  07/07/2001

I want each ID to occupy a single row, so I'd like to create something like this:

ID   Date 1      Procedure(s)                Date 2      Procedure(s)        Date 3      Procedure(s)
D55  01/01/2001  Sedation, Excision, Biopsy
A66  02/02/2001  Sedation, Excision
T44  03/03/2001  Sedation, Biopsy            04/04/2001  Sedation, Excision
G88  05/05/2001  Sedation, Biopsy            06/06/2001  Sedation, Excision  07/07/2001  Sedation, Re-excision

Most IDs have a single date with several procedures documented on it. A handful came in for further procedures on subsequent dates. I can't see any that came in on more than 3 different dates, but a way to count the dates documented per ID would be useful.

I've tried using cast and dcast so far, but I'm not really getting anywhere. I'm very new to R, so any help would be greatly appreciated! Thanks for reading.

Answer 1

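library(tidyverse)

df %>%
  # One row per ID/date pair, with that date's procedures joined into one string.
  # Date is character here, so it sorts as text; convert with as.Date() first if needed.
  group_by(ID, Date) %>%
  summarize(Procedure = paste0(Procedure, collapse = ", "), .groups = "drop_last") %>%
  # Still grouped by ID: number each ID's dates 1, 2, 3, ...
  mutate(col = row_number()) %>%
  ungroup() %>%
  # Spread into Date_1, Date_2, ..., Procedure_1, Procedure_2, ...
  pivot_wider(names_from = col, values_from = c(Date, Procedure))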

This currently requires reordering the columns afterwards, which could be done as in this answer: https://stackoverflow.com/a/60400134/6851825
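As an alternative to reordering by hand, the sketch below asks pivot_wider() for the interleaved column order directly through its names_vary argument. It assumes tidyr 1.2.0 or later, where that argument was introduced, and it rebuilds the question's sample data as df so that it runs on its own; the name wide is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
df <- data.frame(
  ID = c(rep("D55", 3), rep("A66", 2), rep("T44", 4), rep("G88", 6)),
  Procedure = c("Sedation", "Excision", "Biopsy",
                "Sedation", "Excision",
                "Sedation", "Biopsy", "Sedation", "Excision",
                "Sedation", "Biopsy", "Sedation", "Excision",
                "Sedation", "Re-excision"),
  Date = c(rep("01/01/2001", 3), rep("02/02/2001", 2),
           rep("03/03/2001", 2), rep("04/04/2001", 2),
           rep("05/05/2001", 2), rep("06/06/2001", 2),
           rep("07/07/2001", 2))
)

wide <- df %>%
  group_by(ID, Date) %>%
  summarize(Procedure = paste(Procedure, collapse = ", "),
            .groups = "drop_last") %>%
  mutate(col = row_number()) %>%
  ungroup() %>%
  pivot_wider(
    names_from = col,
    values_from = c(Date, Procedure),
    # "slowest" keeps each Date_n next to its Procedure_n
    names_vary = "slowest"
  )

names(wide)
```

Note that the printed tibble that follows comes from the original pipeline, so its columns are still grouped by variable rather than interleaved.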

# A tibble: 4 x 7
  ID    Date_1 Date_2 Date_3 Procedure_1                Procedure_2        Procedure_3          
  <chr> <chr>  <chr>  <chr>  <chr>                      <chr>              <chr>                
1 A66   2/2/01 NA     NA     Sedation, Excision         NA                 NA                   
2 D55   1/1/01 NA     NA     Sedation, Excision, Biopsy NA                 NA                   
3 G88   5/5/01 6/6/01 7/7/01 Sedation, Biopsy           Sedation, Excision Sedation, Re-excision
4 T44   3/3/01 4/4/01 NA     Sedation, Biopsy           Sedation, Excision NA
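The question also asks for a way to count the dates documented per ID. A minimal sketch with dplyr, again rebuilding the sample data as df so that it is self-contained; the names date_counts and n_dates are illustrative, not from the original answer.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  ID = c(rep("D55", 3), rep("A66", 2), rep("T44", 4), rep("G88", 6)),
  Procedure = c("Sedation", "Excision", "Biopsy",
                "Sedation", "Excision",
                "Sedation", "Biopsy", "Sedation", "Excision",
                "Sedation", "Biopsy", "Sedation", "Excision",
                "Sedation", "Re-excision"),
  Date = c(rep("01/01/2001", 3), rep("02/02/2001", 2),
           rep("03/03/2001", 2), rep("04/04/2001", 2),
           rep("05/05/2001", 2), rep("06/06/2001", 2),
           rep("07/07/2001", 2))
)

# Number of distinct dates recorded for each ID
date_counts <- df %>%
  group_by(ID) %>%
  summarize(n_dates = n_distinct(Date))

date_counts

# IDs that came in on more than one date
filter(date_counts, n_dates > 1)

# Largest number of dates for any ID, i.e. how many Date_n columns to expect
max(date_counts$n_dates)
```

On the sample data this should report one date each for A66 and D55, two for T44 and three for G88, which matches the three Date columns in the wide result.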

Answer 1, dated 2021-10-22 02:17:00
