Remove duplicates based on a condition in R

Question

I have a dummy dataframe with four columns.

df <- data.frame(City = c("A", "A", "A", "B", "B", "B", "B"),
                 Name = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
                 Age = c(23, 41, 32, 58, 26, 12, 15),
                 Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown"))

  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    A  Bill  32     Brown
4    B Maria  58     Brown
5    B   Ben  26      Blue
6    B  Tina  12      Blue
7    B  Tina  15     Brown

I want to remove the duplicates in Name (Bill and Tina) in two different ways:

First case: group by City and remove duplicates in Name, keeping only the blue-eyed row. Result 1 should look like this:

  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    B Maria  58     Brown
4    B   Ben  26      Blue
5    B  Tina  12      Blue

Second case: the eye color to keep depends on the city. If City is A, keep the Blue row among the duplicated names; if City is B, keep the Brown row among the duplicated names.

Result 2 should look like this:

  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    B Maria  58     Brown
4    B   Ben  26      Blue
5    B  Tina  15     Brown

Thanks for the help!

Answer 1

Here is one possibility using filter() from dplyr:

First, within each group of rows sharing a Name, we keep only the rows where Eye_color == "Blue", but only if at least one row in that group is "Blue". Groups with no "Blue" row (such as Maria) are kept unchanged.

library(dplyr)

# If the group has a Blue row, keep only Blue rows; otherwise keep the whole group
df %>%
  group_by(Name) %>%
  filter(if (any(Eye_color == "Blue")) Eye_color == "Blue" else TRUE) %>%
  ungroup()
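
The question asks for grouping by City, while the code above groups by Name only. On this data the two give the same rows, but if the same name could occur in more than one city, grouping on both columns may be closer to the intent. A minimal sketch under that assumption (`result1` is a name chosen here for illustration):

```r
library(dplyr)

df <- data.frame(
  City = c("A", "A", "A", "B", "B", "B", "B"),
  Name = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
  Age = c(23, 41, 32, 58, 26, 12, 15),
  Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown")
)

# Within each City/Name group: if any row is blue-eyed keep only those rows,
# otherwise keep the whole group (e.g. Maria, who only has a Brown row).
result1 <- df %>%
  group_by(City, Name) %>%
  filter(if (any(Eye_color == "Blue")) Eye_color == "Blue" else TRUE) %>%
  ungroup()

print(result1)
```

Note that if a group had two Blue rows, both would be kept; this only removes the non-Blue duplicates.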

In the second case we use if_else() inside the filter() call:

df %>%
  filter(if_else(Name == "Bill", Eye_color == "Blue", if_else(Name == "Tina", Eye_color == "Brown", TRUE)))

Update

For the new dataset you can use the same code for part 1. For part 2, simply replace the logical statements inside if_else():

df %>%
  filter(if_else(City == "A", Eye_color == "Blue", if_else(City == "B", Eye_color == "Brown", TRUE)))
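
One caveat: the City-based filter above applies the eye-color rule to every row in a city, not only to duplicated names, so on the example data it would also drop Ben (Blue, City B), who appears in the expected Result 2. A possible way to restrict the rule to duplicates is sketched below; `keep_color` and `result2` are hypothetical names introduced here, and the approach assumes a name counts as duplicated when it occurs more than once within the same city.

```r
library(dplyr)

df <- data.frame(
  City = c("A", "A", "A", "B", "B", "B", "B"),
  Name = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
  Age = c(23, 41, 32, 58, 26, 12, 15),
  Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown")
)

# Hypothetical lookup: which eye color to prefer among duplicates in each city
keep_color <- c(A = "Blue", B = "Brown")

result2 <- df %>%
  group_by(City, Name) %>%
  # n() == 1: the name is unique in its city, so keep it unconditionally;
  # otherwise keep only the row whose eye color matches the city's rule
  filter(n() == 1 | Eye_color == unname(keep_color[as.character(City)])) %>%
  ungroup()

print(result2)
```

If a duplicated name has no row with the preferred color, this sketch drops all of its rows, so a fallback would be needed for such data.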

Answer 2

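You can use this code:

# Keep names that occur once; among duplicated names keep only the blue-eyed row
df1 <- df %>% group_by(Name) %>% filter(n() == 1 | Eye_color == "Blue") %>% ungroup()
# Name-specific rule: Bill keeps Blue, Tina keeps Brown, everyone else is kept
df2 <- df %>% filter(if_else(Name == "Bill", Eye_color == "Blue", if_else(Name == "Tina", Eye_color == "Brown", TRUE)))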
