Merge rows in R when two columns match

Question

I have a data frame such as:

Species   Family  Events  Groups
Monkey    A       6,7     G1,G2
Monkey    A,B     6,8,9   G1,G2,G4,G8,G12
Elephant  B       7,8     G6,G7
Elephant  C       9,10    G6
Dog       K       10      G90
Dog       L,M,N   8,10,9  G90,G91

The idea is to merge rows within each Species when the rows share at least one value in both the Events and the Groups columns.

For example, for Monkey:

Species   Family  Events  Groups
Monkey    A       6,7     G1,G2
Monkey    A,B     6,8,9   G1,G2,G4,G8,G12

Event 6 and group G1 from row 1 also appear in row 2, so I merge the two rows:

Species   Family  Events  Groups
Monkey    A,B     6,7,8,9 G1,G2,G4,G8,G12
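
The rule applied to the two Monkey rows can be sketched in base R as follows. This is only an illustration of the matching logic, not a full solution; the helper name `split_values` is made up for this example.

```r
# Hypothetical helper: split a comma-separated string into trimmed values
split_values <- function(x) trimws(strsplit(as.character(x), ",")[[1]])

# The two Monkey rows from the example
events_1 <- split_values("6,7")
events_2 <- split_values("6,8,9")
groups_1 <- split_values("G1,G2")
groups_2 <- split_values("G1,G2,G4,G8,G12")

# Values that appear in both rows, per column
shared_events <- intersect(events_1, events_2)
shared_groups <- intersect(groups_1, groups_2)

# The rows match only when both columns share at least one value
rows_match <- length(shared_events) > 0 && length(shared_groups) > 0

# The merged row holds the union of the values from both rows
merged_events <- paste(union(events_1, events_2), collapse = ",")
merged_groups <- paste(union(groups_1, groups_2), collapse = ",")
```

Here `rows_match` is `TRUE`, and the merged strings correspond to the Events and Groups cells of the merged Monkey row.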

The final expected output would be:

Species   Family  Events  Groups
Monkey    A,B     6,7,8,9 G1,G2,G4,G8,G12
Elephant  B       7,8     G6,G7
Elephant  C       9,10    G6
Dog       K,L,M,N 8,9,10  G90,G91

I did not merge the Elephant rows because their Events columns have no value in common.
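
As a small sketch of why the Elephant rows stay separate, the check below tests each column for a shared value. The helper `shares_value` is a hypothetical name introduced only for this illustration.

```r
# Hypothetical helper: TRUE when two comma-separated strings share a value
shares_value <- function(a, b) {
  a_values <- trimws(strsplit(a, ",")[[1]])
  b_values <- trimws(strsplit(b, ",")[[1]])
  length(intersect(a_values, b_values)) > 0
}

# The two Elephant rows
events_match <- shares_value("7,8", "9,10")   # no event in common
groups_match <- shares_value("G6,G7", "G6")   # G6 is in both rows

# Both columns must match for the rows to be merged
merge_elephant <- events_match && groups_match
```

The Groups columns do overlap on G6, but since Events do not overlap, `merge_elephant` is `FALSE` and the rows are left as they are.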

Does anyone have an idea of how to code this? Thanks.

Here is the data:

df <- structure(list(Species = structure(c(3L, 3L, 2L, 2L, 1L, 1L), .Label = c("Dog", 
"Elephant", "Monkey"), class = "factor"), Family = structure(1:6, .Label = c("A", 
"A,B", "B", "C", "K", "L,M,N"), class = "factor"), Events = structure(c(2L, 
3L, 4L, 6L, 1L, 5L), .Label = c("10", "6,7", "6,8,9", "7,8", 
"8,10,9", "9,10"), class = "factor"), Groups = structure(c(1L, 
3L, 2L, 4L, 5L, 6L), .Label = c(" G1,G2", " G6,G7", "G1,G2,G4,G8,G12", 
"G6", "G90", "G90,G91"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))

Answer 1

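You can follow this strategy: first find the species whose rows share at least one value in both Events and Groups, then take the union of the values for those species and append the remaining rows unchanged.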

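library(tidyverse)

# Work on character columns; the dput data stores them as factors
df_chr <- df %>%
  mutate(across(c(Family, Events, Groups), as.character))

# Split comma-separated strings and trim stray spaces (e.g. " G1,G2" in the data)
split_trim <- function(x) lapply(strsplit(x, ","), trimws)

# Species whose rows share at least one value in both Events and Groups
df1 <- df_chr %>%
  group_by(Species) %>%
  summarise(across(c(Events, Groups), ~ toString(Reduce(intersect, split_trim(.))))) %>%
  filter(Events != "" & Groups != "") %>%
  select(Species)

# Merge those species with a per-column union, then append the other rows unchanged
df1 %>%
  left_join(df_chr, by = "Species") %>%
  group_by(Species) %>%
  summarise(across(c(Family, Events, Groups), ~ toString(Reduce(union, split_trim(.))))) %>%
  rbind(df_chr %>% anti_join(df1, by = "Species"))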

# A tibble: 4 x 4
  Species  Family     Events     Groups             
  <fct>    <chr>      <chr>      <chr>              
1 Dog      K, L, M, N 10, 8, 9   G90, G91           
2 Monkey   A, B       6, 7, 8, 9 G1, G2, G4, G8, G12
3 Elephant B          7,8        G6,G7              
4 Elephant C          9,10       G6

Solution 1 (accepted), 2021-03-05 08:21:24
