Subset dataframe based on levels

I have the following dataframe in R. I want to subset it based on three criteria, applied to each unique value of x within each level of id:

  1. If there is only one row for that value of x, keep that row.
  2. If x has the same value of z with two different values of y, keep the row where y is not 1.3.
  3. If x has three values of z, keep the two rows where y is not 1.3.

 id x   y   z
  a 1 0.2 100
  a 2 1   200
  a 2 1.3 200
  b 1 0.5 400
  b 1 1   500
  b 1 1.3 600

The solution would look like this:

 id x   y   z
  a 1 0.2 100
  a 2 1   200
  b 1 0.5 400
  b 1 1   500

Any help would be appreciated.

We can group by 'id' and 'x', then filter on the conditions: keep a row if it is the only one in its group, or if its group has more than one row and y is not 1.3.

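library(dplyr)

df1 %>%
   group_by(id, x) %>%   # one group per id/x combination
   # keep single-row groups; in larger groups drop the rows where y is 1.3
   filter(n() == 1 | (n() > 1 & y != 1.3))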

data

df1 <- structure(list(id = c("a", "a", "a", "b", "b", "b"), x = c(1L, 
2L, 2L, 1L, 1L, 1L), y = c(0.2, 1, 1.3, 0.5, 1, 1.3), z = c(100L, 
200L, 200L, 400L, 500L, 600L)), class = "data.frame", row.names = c(NA, 
-6L))
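
For anyone who would rather not depend on dplyr, the same filter can be sketched in base R. This is only an illustration and not part of the answer above: the names group_size, is_excluded and res are made up here, and the 1e-8 tolerance is an assumption added to avoid an exact floating-point comparison against 1.3.

```r
# Same example data as df1 above, rebuilt here so the block runs on its own.
df1 <- data.frame(
  id = c("a", "a", "a", "b", "b", "b"),
  x  = c(1L, 2L, 2L, 1L, 1L, 1L),
  y  = c(0.2, 1, 1.3, 0.5, 1, 1.3),
  z  = c(100L, 200L, 200L, 400L, 500L, 600L),
  stringsAsFactors = FALSE
)

# Number of rows in each id/x group, repeated for every row of that group.
group_size <- ave(seq_len(nrow(df1)), df1$id, df1$x, FUN = length)

# Tolerance-based test for "y is 1.3" (the 1e-8 tolerance is an arbitrary choice).
is_excluded <- abs(df1$y - 1.3) < 1e-8

# Keep single-row groups; in larger groups drop the rows where y is 1.3.
res <- df1[group_size == 1 | !is_excluded, ]
rownames(res) <- NULL
print(res)
```

If y comes out of a calculation rather than being typed in, a tolerance check like this may be safer than y != 1.3. On the example data both approaches should keep the same four rows.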
