R: Combine each distinct value of one column into another column

Question

I have data that looks something like this (but actually much larger, around 100000 lines).

  ID CODE
1  A   F1
2  A   F2
3  B   F3
4  B   F1
5  C   F1
6  C   F1
7  C   F2

I need to write all distinct CODEs for each ID into one column. I have gotten halfway by doing:

Data %>% arrange(ID) %>% group_by(ID) %>% distinct(CODE)
  CODE  ID   
  <fct> <fct>
1 F1    A    
2 F2    A    
3 F3    B    
4 F1    B    
5 F1    C    
6 F2    C

But what I need should look something like this (where the column all_CODEs holds all codes for each ID, written into a single string):

  ID all_CODEs
1  A     F1 F2
2  B     F3 F1
3  C     F1 F2

Can anyone help?

Answer 1

If you are up for a base R solution, assuming df is your data frame:

df1 <- df[!duplicated(df), ] ## drop duplicated rows of df

aggregate( CODE ~ ID, data=df1, paste0, collapse=" ")

Output:

#   ID  CODE
# 1  A F1 F2
# 2  B F3 F1
# 3  C F1 F2
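
As a variation on the same idea, the de-duplication can be folded into the aggregation function itself. This is only a sketch: the data frame is rebuilt from the sample in the question, and the renaming step to all_CODEs is my addition to match the desired output.

```r
# Sample data copied from the question
df <- data.frame(
  ID   = c("A", "A", "B", "B", "C", "C", "C"),
  CODE = c("F1", "F2", "F3", "F1", "F1", "F1", "F2"),
  stringsAsFactors = FALSE
)

# unique() keeps the first occurrence of each CODE within an ID,
# then paste() joins the remaining values with a space
res <- aggregate(
  CODE ~ ID,
  data = df,
  FUN  = function(x) paste(unique(x), collapse = " ")
)

# Rename the aggregated column to match the requested layout
names(res)[2] <- "all_CODEs"
res
```

This skips the intermediate df1, which may be convenient when the data frame has other columns that should not take part in the duplicate check.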

Answer 2

After the distinct step, we can summarise by pasting the 'CODE' values into a single string:

library(dplyr)
library(stringr)
Data %>%
  arrange(ID) %>% 
  distinct() %>%
  group_by(ID) %>% 
  summarise(all_CODEs = str_c(CODE, collapse=' '))
# A tibble: 3 x 2
#  ID    all_CODEs
#  <chr> <chr>    
#1 A     F1 F2    
#2 B     F3 F1    
#3 C     F1 F2

NOTE: distinct on a single column will return only that column with the distinct rows, because by default .keep_all = FALSE. Here, it seems that distinct should be applied to both columns.
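
To make the note concrete, here is a small sketch of how distinct behaves on an ungrouped copy of the sample data. The object names one_col, both_cols and keep_all are made up for illustration.

```r
library(dplyr)

# Sample data copied from the question (ungrouped)
Data <- data.frame(
  ID   = c("A", "A", "B", "B", "C", "C", "C"),
  CODE = c("F1", "F2", "F3", "F1", "F1", "F1", "F2"),
  stringsAsFactors = FALSE
)

# Only CODE is named, so only CODE is returned (.keep_all = FALSE by default)
one_col <- distinct(Data, CODE)

# Naming both columns keeps every unique ID/CODE pair
both_cols <- distinct(Data, ID, CODE)

# .keep_all = TRUE keeps the other columns, taking the first row per CODE
keep_all <- distinct(Data, CODE, .keep_all = TRUE)

dim(one_col)    # 3 rows, 1 column
dim(both_cols)  # 6 rows, 2 columns
dim(keep_all)   # 3 rows, 2 columns
```

For this question the second form (or a bare distinct() on a two-column data frame) is the one that preserves the ID needed for the later group_by.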

Data

Data <- structure(list(ID = c("A", "A", "B", "B", "C", "C", "C"), CODE = c("F1", 
"F2", "F3", "F1", "F1", "F1", "F2")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))

Answer 1
3 2020-02-18 17:17:30

Answer 2
2 Accepted 2020-02-18 17:03:02
