Value in a third column based on the grouping of other columns

Question

I need to set a label for each id in column a, based on the values that exist in column b for that id. For example, if id 1 has only "F", the result should be "Female"; if it has only "M", the result should be "Male"; and if it has both, the result should be "Mixed".

This is the base data frame:

    df=data.frame(
      a=c(1,1,1,2,2,3,3,3,3,3),
      b=c("F","M","F","M","M","F","F","F","F","F"))

And this is the expected result:

    df$Result=c("Mixed", "Mixed", "Mixed", "Male", "Male", "Female", "Female", "Female", "Female", "Female")

       a b Result
    1  1 F  Mixed
    2  1 M  Mixed
    3  1 F  Mixed
    4  2 M   Male
    5  2 M   Male
    6  3 F Female
    7  3 F Female
    8  3 F Female
    9  3 F Female
    10 3 F Female

Could someone please help me calculate this df$Result column? Thanks in advance!

Answer 1

After grouping by 'a', check the number of distinct elements in 'b'. If it is greater than 1, return "Mixed"; otherwise return the relabelled value of 'b' ("Male" or "Female").

    library(dplyr)

    df %>%
      # Map each row's code to its full label: "M" -> "Male", "F" -> "Female"
      mutate(b1 = c("Male", "Female")[(b == "F") + 1]) %>%
      group_by(a) %>%
      # More than one distinct value of b within an id means "Mixed"
      mutate(Result = case_when(n_distinct(b) > 1 ~ "Mixed", TRUE ~ b1)) %>%
      select(-b1)
    # A tibble: 10 x 3
    # Groups:   a [3]
    #       a b     Result
    #   <dbl> <chr> <chr> 
    # 1     1 F     Mixed 
    # 2     1 M     Mixed 
    # 3     1 F     Mixed 
    # 4     2 M     Male  
    # 5     2 M     Male  
    # 6     3 F     Female
    # 7     3 F     Female
    # 8     3 F     Female
    # 9     3 F     Female
    #10     3 F     Female

Data

    df <- data.frame(
      a = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
      b = c("F", "M", "F", "M", "M", "F", "F", "F", "F", "F"),
      stringsAsFactors = FALSE)
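
For readers who prefer not to load dplyr, the same per-group check can be sketched in base R. This is only an illustration of the idea described above, not part of the original answer; the helper names `n_levels` and `labels` are made up for this example, and it assumes column b contains only "F" and "M".

```r
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  b = c("F", "M", "F", "M", "M", "F", "F", "F", "F", "F"),
  stringsAsFactors = FALSE)

# For every row, count the distinct values of b within that row's id.
# ave() applies the function per group of a and recycles the result
# back to the original row positions.
n_levels <- ave(seq_along(df$b), df$a,
                FUN = function(i) length(unique(df$b[i])))

# Hypothetical lookup table from code to full label
labels <- c(F = "Female", M = "Male")

# More than one distinct value means "Mixed"; otherwise use the full label
df$Result <- ifelse(n_levels > 1, "Mixed", unname(labels[df$b]))
df
```

Unlike the dplyr version, this returns a plain data.frame with no grouping attached, which may be convenient if the result feeds into further base R code.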

Answer 2

A solution with data.table:

    library(data.table)

    a <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
    b <- c("F", "M", "F", "M", "M", "F", "F", "F", "F", "F")
    df <- data.table(a, b)

    # Store the number of distinct values of b per id (as a string)
    df[, result := as.character(uniqueN(b)), by = a]
    # One distinct value: expand it to its full label; otherwise "Mixed"
    df[, result := ifelse(result == "1", ifelse(b == "M", "Male", "Female"), "Mixed")]
    df
    #     a b result
    #  1: 1 F  Mixed
    #  2: 1 M  Mixed
    #  3: 1 F  Mixed
    #  4: 2 M   Male
    #  5: 2 M   Male
    #  6: 3 F Female
    #  7: 3 F Female
    #  8: 3 F Female
    #  9: 3 F Female
    # 10: 3 F Female

Answer 1: 2019-08-07 13:11:49
Answer 2: 2019-08-07 13:19:03
