Groupby and keep only groups that contain all elements in a list

I have a df such as:

Groups COL1
G1 SP1-3
G1 SP2s
G1 SP4_09
G1 SP7z
G3 SP1_OK
G3 SP1-9
G4 SP1_3
G4 SP2_3
G5 SP3_5

I would like to keep only the groups whose COL1 values contain all of the strings in list=c('SP1','SP2').

Here I should get:

Groups COL1
G1 SP1-3
G1 SP2s
G1 SP4_09
G1 SP7z
G4 SP1_3
G4 SP2_3

I keep only G1 and G4 because their COL1 values contain both SP1 and SP2; the other groups do not contain both.
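To make the rule concrete, here is a minimal base R sketch that tabulates, per group, whether each string occurs in at least one COL1 value. It assumes plain substring matching, and the names `df`, `patterns`, `present`, `per_group` and `keep` are illustrative only.

```r
# Toy copy of the example data, with character columns for simplicity.
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G1", "G3", "G3", "G4", "G4", "G5"),
  COL1   = c("SP1-3", "SP2s", "SP4_09", "SP7z", "SP1_OK", "SP1-9",
             "SP1_3", "SP2_3", "SP3_5"),
  stringsAsFactors = FALSE
)
patterns <- c("SP1", "SP2")

# One logical column per pattern: does this row's COL1 contain it?
# (fixed = TRUE treats each pattern as a literal substring, an assumption.)
present <- sapply(patterns, function(p) grepl(p, df$COL1, fixed = TRUE))

# Collapse rows to groups: TRUE if any row in the group contains the pattern.
per_group <- rowsum(present * 1L, df$Groups) > 0
per_group

# A group qualifies only if every pattern was found in it.
keep <- rownames(per_group)[rowSums(per_group) == length(patterns)]
keep

df[df$Groups %in% keep, ]
```

In the `per_group` table, G3 has SP1 but no SP2 and G5 has neither, which is why only G1 and G4 remain.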

Data

structure(list(Groups = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 
3L, 4L), .Label = c("G1", "G3", "G4", "G5"), class = "factor"), 
    COL1 = structure(c(3L, 6L, 8L, 9L, 2L, 4L, 1L, 5L, 7L), .Label = c("SP1_3", 
    "SP1_OK", "SP1-3", "SP1-9", "SP2_3", "SP2s", "SP3_5", "SP4_09", 
    "SP7z"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))

The following approach should work.

library(dplyr)

library(stringr)

data <- structure(list(Groups = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 4L),
                                          .Label = c("G1", "G3", "G4", "G5"),
                                          class = "factor"), 
                       COL1 = structure(c(3L, 6L, 8L, 9L, 2L, 4L, 1L, 5L, 7L),
                                        .Label = c("SP1_3", "SP1_OK", "SP1-3",
                                                   "SP1-9", "SP2_3", "SP2s", "SP3_5",
                                                   "SP4_09", "SP7z"),
                                        class = "factor")),
                  class = "data.frame",
                  row.names = c(NA, -9L))


data %>% 
  group_by(Groups) %>% 
  # keep a group only if some COL1 value matches "SP1" and some value matches "SP2"
  filter(any(str_detect(COL1, "SP1")) &
           any(str_detect(COL1, "SP2")))

#> # A tibble: 6 x 2
#> # Groups:   Groups [2]
#>   Groups COL1  
#>   <fct>  <fct> 
#> 1 G1     SP1-3 
#> 2 G1     SP2s  
#> 3 G1     SP4_09
#> 4 G1     SP7z  
#> 5 G4     SP1_3 
#> 6 G4     SP2_3

Created on 2020-06-10 by the reprex package (v0.3.0)
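If the list of strings is longer or not known in advance, the same grouped filter can be written against a vector instead of one `str_detect()` call per string. This is a sketch under assumptions, not part of the original answer: `patterns` and `result` are illustrative names, and `fixed()` is used so each entry is matched as a literal substring rather than a regular expression.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  Groups = c("G1", "G1", "G1", "G1", "G3", "G3", "G4", "G4", "G5"),
  COL1   = c("SP1-3", "SP2s", "SP4_09", "SP7z", "SP1_OK", "SP1-9",
             "SP1_3", "SP2_3", "SP3_5"),
  stringsAsFactors = FALSE
)
patterns <- c("SP1", "SP2")

result <- df %>%
  group_by(Groups) %>%
  # For each group: every pattern must be found in at least one COL1 value.
  filter(all(vapply(patterns,
                    function(p) any(str_detect(COL1, fixed(p))),
                    logical(1)))) %>%
  ungroup()

result
```

Note that substring matching would also treat a value such as "SP10_x" as containing "SP1". If that matters for the real data, a stricter pattern would be needed.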
