Subset a dataset in R based on three columns

Question

In the following data frame, for records that share the same id and name, I want to remove the rows where class is 0.

For example, the 1st and 2nd records have the same id and name, as do the 3rd and 4th.

In the final data frame, those class 0 rows should be gone and all other rows should stay.

Please help me do this in R. My actual dataset has thousands of such records.

Here is the sample dataset:

Data <- data.frame(id = c(1,1,2,2,3,4,5),name = c("asd","asd","pqr","pqr","fgh","yut","kju"),
           date = c("02/03/2022","10/05/2022","23/01/2022","15/04/2022","19/05/2022","14/02/2022","10/06/2022"),
           class = c(0,1,0,1,0,0,1))

Answer 1

You may try:

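library(dplyr)
Data %>%
  group_by(id, name) %>%              # the question matches on both id and name
  filter(!(n() > 1 & class == 0))     # drop class 0 rows only in groups with more than one row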

     id name  date       class
  <dbl> <chr> <chr>      <dbl>
1     1 asd   10/05/2022     1
2     2 pqr   15/04/2022     1
3     3 fgh   19/05/2022     0
4     4 yut   14/02/2022     0
5     5 kju   10/06/2022     1
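
If dplyr is not available, the same filter can probably be written in base R. This is only a sketch under the question's assumptions (rows are matched on both id and name); `group_size` and `result` are names made up for this illustration, not part of the original answer.

```r
# Sample data from the question
Data <- data.frame(
  id = c(1, 1, 2, 2, 3, 4, 5),
  name = c("asd", "asd", "pqr", "pqr", "fgh", "yut", "kju"),
  date = c("02/03/2022", "10/05/2022", "23/01/2022", "15/04/2022",
           "19/05/2022", "14/02/2022", "10/06/2022"),
  class = c(0, 1, 0, 1, 0, 0, 1)
)

# For every row, count how many rows share its (id, name) pair
group_size <- ave(seq_len(nrow(Data)), Data$id, Data$name, FUN = length)

# Drop a row only when its pair occurs more than once AND its class is 0
result <- Data[!(group_size > 1 & Data$class == 0), ]
rownames(result) <- NULL
result
```

Like the dplyr version, this keeps every class 1 row in a duplicated group and leaves single-row groups untouched, even when their class is 0.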

Answer 2

Or a data.table approach:

library(data.table)

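setDT(Data)  # converts Data to a data.table by reference
# Sort so class 1 comes first within each id, then keep the first row per id and name
unique(Data[order(id, -class)], by = c("id", "name"))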

Output:

| id|name |date       | class|
|--:|:----|:----------|-----:|
|  1|asd  |10/05/2022 |     1|
|  2|pqr  |15/04/2022 |     1|
|  3|fgh  |19/05/2022 |     0|
|  4|yut  |14/02/2022 |     0|
|  5|kju  |10/06/2022 |     1|

Answer 1: 4, accepted, 2022-06-15 08:31:06

Answer 2: 1, 2022-06-15 08:48:25
