R: unite every distinct value in one column into another

I have data that looks something like this (but actually much larger, around 100000 rows).

  ID CODE
1  A   F1
2  A   F2
3  B   F3
4  B   F1
5  C   F1
6  C   F1
7  C   F2

I need to write all the different CODEs for each ID into one column. I have gotten halfway there by doing:

Data %>% arrange(ID) %>% group_by(ID) %>% distinct(CODE)
  CODE  ID   
  <fct> <fct>
1 F1    A    
2 F2    A    
3 F3    B    
4 F1    B    
5 F1    C    
6 F2    C 

But what I need should look something like this (where the column all_CODEs holds all codes for each ID, written into a single string):

  ID all_CODEs
1  A     F1 F2
2  B     F3 F1
3  C     F1 F2

Can anyone help?

If you are up for a base R solution, assuming df is your data frame:

df1 <- df[!duplicated(df), ]  # drop rows that repeat an earlier (ID, CODE) pair

aggregate(CODE ~ ID, data = df1, FUN = paste, collapse = " ")

Output:

#   ID  CODE
# 1  A F1 F2
# 2  B F3 F1
# 3  C F1 F2
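The two base R steps above can also be folded into one call by applying unique() inside the aggregating function. This is only a sketch, not the answerer's code: it uses the Data object from the "data" section instead of df, and the helper name collapse_unique is made up for illustration.

```r
# Sample data copied from the "data" section of this page
Data <- data.frame(
  ID   = c("A", "A", "B", "B", "C", "C", "C"),
  CODE = c("F1", "F2", "F3", "F1", "F1", "F1", "F2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the first occurrence of each value,
# then join the remaining values with a space
collapse_unique <- function(x) paste(unique(x), collapse = " ")

# One aggregated row per ID
result <- aggregate(CODE ~ ID, data = Data, FUN = collapse_unique)

# Rename the aggregated column to match the layout asked for in the question
names(result)[2] <- "all_CODEs"
print(result)
```

Because unique() runs within each ID group, no separate de-duplication pass over the whole data frame is needed, and the codes stay in their order of first appearance.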

After the distinct step, we can summarise by pasting the 'CODE' values into a single string:

library(dplyr)
library(stringr)
Data %>%
  arrange(ID) %>% 
  distinct() %>%
  group_by(ID) %>% 
  summarise(all_CODEs = str_c(CODE, collapse=' '))
# A tibble: 3 x 2
#  ID    all_CODEs
#  <chr> <chr>    
#1 A     F1 F2    
#2 B     F3 F1    
#3 C     F1 F2    
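A possible variation on the pipeline above, assuming dplyr is installed: the de-duplication can be moved inside summarise() with unique(), so the separate distinct() and arrange() steps are not needed. This is a sketch of an alternative, not a replacement for the answer's code, and it uses base paste() so that stringr is not required.

```r
library(dplyr)

# Sample data copied from the "data" section of this page
Data <- data.frame(
  ID   = c("A", "A", "B", "B", "C", "C", "C"),
  CODE = c("F1", "F2", "F3", "F1", "F1", "F1", "F2"),
  stringsAsFactors = FALSE
)

result <- Data %>%
  group_by(ID) %>%
  # unique() drops repeated codes within each ID before they are joined
  summarise(all_CODEs = paste(unique(CODE), collapse = " "))

print(result)
```

summarise() returns one row per ID with the groups in sorted order, which is why the explicit arrange(ID) is left out here.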

NOTE: distinct on a single column will return only that column with the distinct rows, because by default .keep_all = FALSE. Here, it seems that distinct should be applied on the two columns.
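To make the note concrete, here is a small sketch (assuming dplyr is installed) comparing what distinct() returns on the sample data in a few configurations. The object names are made up for illustration.

```r
library(dplyr)

# Sample data copied from the "data" section of this page
Data <- data.frame(
  ID   = c("A", "A", "B", "B", "C", "C", "C"),
  CODE = c("F1", "F2", "F3", "F1", "F1", "F1", "F2"),
  stringsAsFactors = FALSE
)

# Ungrouped, single column: only CODE is returned and ID is dropped
only_code <- Data %>% distinct(CODE)

# .keep_all = TRUE keeps the other columns, taking the first row for each CODE
first_rows <- Data %>% distinct(CODE, .keep_all = TRUE)

# Distinct over both columns: one row per (ID, CODE) pair
pairs <- Data %>% distinct(ID, CODE)

# On grouped data the grouping column is retained, which is why the
# question's group_by(ID) %>% distinct(CODE) output still shows ID
grouped <- Data %>% group_by(ID) %>% distinct(CODE)

print(only_code)
print(first_rows)
print(pairs)
print(grouped)
```

On this data, only_code and first_rows each have three rows (one per code), while pairs and grouped each have six (one per ID and CODE combination).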

data

Data <- structure(list(ID = c("A", "A", "B", "B", "C", "C", "C"), CODE = c("F1", 
"F2", "F3", "F1", "F1", "F1", "F2")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))

