How to mutate a new variable in R dplyr based on the conditions of 2 columns?

Question

I have a dataset which looks like this:

Recipient  ID
(chr)       (chr)  
Smith       C
Wells       S
Wells       S
Jones       S
Jones       N
Wu          C
Wu          N
Wu          S

I want to mutate a new variable that is either "Unique" or "Multiple", based on whether: Recipient appears once (Unique); Recipient appears more than once but has the same ID for each occurrence (Unique); or Recipient appears more than once AND has more than one distinct ID (Multiple). I've tried to use:

df %>%
 group_by(Recipient, ID) %>%
 mutate(Freq = case_when(
                str_count(Recipient) == 1 & str_count(ID) == 1 ~ "Unique",
                str_count(Recipient) > 2 & str_count(ID) == 1 ~ "Unique",
                str_count(Recipient) > 2 & str_count(ID) > 1 ~ "Multiple"))

When I did this, all the values were "Multiple":

Recipient  ID     Freq
(chr)      (chr)  (chr)
Smith       C     Multiple (should be Unique)
Wells       S     Multiple (should be Unique)
Wells       S     Multiple (should be Unique)
Jones       S     Multiple
Jones       N     Multiple
Wu          C     Multiple
Wu          N     Multiple
Wu          S     Multiple

I've tried multiple times, but can't crack it. Can anyone help to solve this, or recommend an easier way to code this? Thanks!

Answer 1

A possible solution with n_distinct():

library(dplyr)

df %>%
  group_by(Recipient) %>%
  mutate(Freq = ifelse(n_distinct(ID) == 1, "unique", "multiple")) %>%
  ungroup()

# A tibble: 8 x 3
  Recipient ID    Freq
  <chr>     <chr> <chr>
1 Smith     C     unique
2 Wells     S     unique
3 Wells     S     unique
4 Jones     S     multiple
5 Jones     N     multiple
6 Wu        C     multiple
7 Wu        N     multiple
8 Wu        S     multiple

Data

df <- structure(list(Recipient = c("Smith", "Wells", "Wells", "Jones", 
"Jones", "Wu", "Wu", "Wu"), ID = c("C", "S", "S", "S", "N", "C",
"N", "S")), class = "data.frame", row.names = c(NA, -8L))
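
To see why grouping by Recipient alone is enough, it may help to look at the per-group counts that drive the rule. The attempt in the question uses str_count(), which works inside each string rather than counting rows, whereas n() and n_distinct() count within the current group. The sketch below is only an illustration: the names counts, n_rows, n_ids and labelled are made up here, and the capitalised labels are taken from the question rather than from the answer above.

```r
library(dplyr)

# Same sample data as in the answer, rebuilt so the block runs on its own
df <- data.frame(
  Recipient = c("Smith", "Wells", "Wells", "Jones", "Jones", "Wu", "Wu", "Wu"),
  ID        = c("C", "S", "S", "S", "N", "C", "N", "S")
)

# Per-recipient counts: rows in the group and distinct IDs in the group
# (counts, n_rows and n_ids are illustrative names)
counts <- df %>%
  group_by(Recipient) %>%
  summarise(n_rows = n(), n_ids = n_distinct(ID), .groups = "drop")

# Same rule as the answer, with the capitalised labels from the question;
# a recipient is "Unique" exactly when it has one distinct ID
labelled <- df %>%
  group_by(Recipient) %>%
  mutate(Freq = if_else(n_distinct(ID) == 1, "Unique", "Multiple")) %>%
  ungroup()

print(counts)
print(labelled)
```

A recipient that appears once necessarily has one distinct ID, so the "appears once" and "appears several times with the same ID" cases collapse into the single condition n_distinct(ID) == 1.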

Answer 2

Here is the update after clarification:

library(dplyr)

df %>% 
  group_by(Recipient) %>% 
  mutate(Freq = paste(Recipient, ID),
         Freq = ifelse(Freq %in% Freq[duplicated(Freq)], "unique", "multiple"),
         Freq = ifelse(Recipient %in% Recipient[duplicated(Recipient)], Freq, "unique"))

  Recipient ID    Freq    
  <chr>     <chr> <chr>   
1 Smith     C     unique  
2 Wells     S     unique  
3 Wells     S     unique  
4 Jones     S     multiple
5 Jones     N     multiple
6 Wu        C     multiple
7 Wu        N     multiple
8 Wu        S     multiple
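
One thing that may be worth checking before relying on the duplicated() approach: it labels each row separately, so a recipient whose IDs are partly repeated can end up with mixed labels. The sketch below uses a hypothetical recipient "Lee" (not part of the original data) with IDs S, S, N, and compares the result with an n_distinct() rule applied per recipient. The names edge, by_duplicated and by_n_distinct are illustrative only.

```r
library(dplyr)

# Hypothetical recipient that is NOT in the original data:
# appears three times, with a repeated ID and one different ID
edge <- data.frame(
  Recipient = c("Lee", "Lee", "Lee"),
  ID        = c("S", "S", "N")
)

# The duplicated()-based code from this answer, unchanged
by_duplicated <- edge %>%
  group_by(Recipient) %>%
  mutate(Freq = paste(Recipient, ID),
         Freq = ifelse(Freq %in% Freq[duplicated(Freq)], "unique", "multiple"),
         Freq = ifelse(Recipient %in% Recipient[duplicated(Recipient)], Freq, "unique")) %>%
  ungroup()

# A per-recipient rule based on the number of distinct IDs
by_n_distinct <- edge %>%
  group_by(Recipient) %>%
  mutate(Freq = ifelse(n_distinct(ID) == 1, "unique", "multiple")) %>%
  ungroup()

print(by_duplicated)
print(by_n_distinct)
```

On the sample data in the question the two approaches agree; they differ only for inputs like this one, where the duplicated() version marks the two "S" rows as unique and the "N" row as multiple.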
