Subset a data frame based on multiple conditions?

Question

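I have a data frame:

df <- data.frame(sample.id = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7),
                 sample.type = c("U", "S", "S", "U", "U", "D", "D", "U", "U", "D"),
                 cond = c(1.4, 17, 12, 0.45, 1, 7, 1, 9, 0, 14))

I want a data frame that contains only the rows for sample.ids that have both sample.type "U" and sample.type "D".

New df:

df.new <- data.frame(sample.id = c(4, 4, 7, 7),
                     sample.type = c("U", "D", "U", "D"),
                     cond = c(1, 7, 0, 14))

What is the simplest way to do this? Checking for duplicated sample.ids does not work, because it returns sample.ids with U and S as well as those with U and D. I can't figure out how to filter/subset for the sample.ids that have both sample.type U and sample.type D. Thanks for any suggestions!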

Answer 1

We can group by sample.id and filter, keeping only the groups whose sample.type contains both "U" and "D":

library(dplyr)
df %>% 
   group_by(sample.id) %>% 
   filter(all(c("U", "D") %in% sample.type))
# A tibble: 4 x 3
# Groups:   sample.id [2]
#  sample.id sample.type  cond
#      <dbl> <fct>       <dbl>
#1         4 U               1
#2         4 D               7
#3         7 U               0
#4         7 D              14

Answer 2

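Use filter with any, so that a group is kept only if it has at least one "U" row and at least one "D" row:

library(dplyr)
df %>%
   group_by(sample.id) %>%
   filter(any(sample.type == 'U') & any(sample.type == 'D'))
# A tibble: 4 x 3
# Groups:   sample.id [2]
#  sample.id sample.type  cond
#      <dbl>      <fctr> <dbl>
#1         4           U     1
#2         4           D     7
#3         7           U     0
#4         7           D    14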

Answer 3

With data.table, return .SD (the group's rows) only for groups that contain both types:

library(data.table)
setDT(df)

df[, if(all(c('U', 'D') %in% sample.type)) .SD, by = sample.id]
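
For comparison, the same selection can probably be done in base R without extra packages. The sketch below is not from the answers above; it assumes the question's data with the sample.type values quoted as strings, and the helper names ids_u, ids_d and keep_ids are made up for illustration. It collects the ids that have a "U" row, the ids that have a "D" row, and keeps the rows whose id is in both sets.

```r
# Data from the question (sample.type values quoted as strings; an assumption).
df <- data.frame(
  sample.id   = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7),
  sample.type = c("U", "S", "S", "U", "U", "D", "D", "U", "U", "D"),
  cond        = c(1.4, 17, 12, 0.45, 1, 7, 1, 9, 0, 14)
)

# ids with at least one "U" row, and ids with at least one "D" row
ids_u <- df$sample.id[df$sample.type == "U"]
ids_d <- df$sample.id[df$sample.type == "D"]

# ids that appear in both sets
keep_ids <- intersect(ids_u, ids_d)

# keep every row belonging to one of those ids
df.new <- df[df$sample.id %in% keep_ids, ]
df.new
```

Like the grouped filters above, this keeps all rows of a matching sample.id, so an id that had U, D and S rows would also keep its S row; add a condition such as sample.type %in% c("U", "D") if only those rows are wanted.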

3 answers

Answer 1
2 2018-06-27 14:48:29

Answer 2
1 2018-06-27 14:49:04

Answer 3
1 2018-06-27 14:49:20