Unique values from group 1, then from groups 1 and 2, and so on

Question

I have a data frame with 5 different groups:

   id group
1  L1     1
2  L2     1
3  L1     2
4  L3     2
5  L4     2
6  L3     3
7  L5     3
8  L6     3
9  L1     4
10 L4     4
11 L2     5

I would like to know whether it is possible to get the unique ids from the 1st group, then from the 1st and 2nd groups, then from the 1st, 2nd and 3rd groups, and so on, without using a for loop. I'm looking for a way to do this with the dplyr or data.table package.

Expected results:

    group      id
1   1          c("L1", "L2")
2   1,2        c("L1", "L2", "L3", "L4")
3   1,2,3      c("L1", "L2", "L3", "L4", "L5", "L6")
4   1,2,3,4    c("L1", "L2", "L3", "L4", "L5", "L6")
5   1,2,3,4,5  c("L1", "L2", "L3", "L4", "L5", "L6")

Data:

df <- structure(list(id = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", 
"L6", "L1", "L4", "L2"), group = structure(c(1L, 1L, 2L, 2L, 
2L, 3L, 3L, 3L, 4L, 4L, 5L), .Label = c("1", "2", "3", "4", "5"
), class = "factor")), .Names = c("id", "group"), row.names = c(NA, 
-11L), class = "data.frame")

Answer 1

With base R, you can do:

# create the "growing" sets of groups
combi_groups <- lapply(seq_along(unique(df$group)), function(i) unique(df$group)[1:i])

# get the unique ID for each set of groups
uniq_ID <- setNames(lapply(combi_groups, function(x) unique(df$id[df$group %in% x])), 
                    sapply(combi_groups, paste, collapse=","))

# $`1`
# [1] "L1" "L2"

# $`1,2`
# [1] "L1" "L2" "L3" "L4"

# $`1,2,3`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

# $`1,2,3,4`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

# $`1,2,3,4,5`
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

If you want to format the result as in your expected output:

data.frame(group=sapply(combi_groups, paste, collapse=", "), id=sapply(uniq_ID, function(x) paste0("c(", paste0("\"", x, "\"", collapse=", "), ")")))
#          group                                    id
#1             1                         c("L1", "L2")
#2          1, 2             c("L1", "L2", "L3", "L4")
#3       1, 2, 3 c("L1", "L2", "L3", "L4", "L5", "L6")
#4    1, 2, 3, 4 c("L1", "L2", "L3", "L4", "L5", "L6")
#5 1, 2, 3, 4, 5 c("L1", "L2", "L3", "L4", "L5", "L6")

Another formatting option, with one row per combination of group set and id:

data.frame(group=rep(names(uniq_ID), sapply(uniq_ID, length)), id=unlist(uniq_ID))
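
The long format above is shown without output, so here is a minimal, self-contained sketch of the same idea. It rebuilds `uniq_ID` from the question's data and uses base R's `stack()` as an alternative to the `rep()`/`unlist()` call; the variable name `long` and the column renaming are illustrative choices, not part of the original answer.

```r
# Example data from the question
df <- data.frame(
  id = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = factor(c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5)),
  stringsAsFactors = FALSE
)

# Growing sets of groups, then the unique ids for each set (as in the answer)
groups <- unique(df$group)
combi_groups <- lapply(seq_along(groups), function(i) groups[seq_len(i)])
uniq_ID <- setNames(
  lapply(combi_groups, function(x) unique(df$id[df$group %in% x])),
  sapply(combi_groups, paste, collapse = ",")
)

# stack() turns a named list of vectors into a two-column data frame
# (columns "values" and "ind"); rename them to match the question
long <- stack(uniq_ID)
names(long) <- c("id", "group")

nrow(long)  # 24 rows: 2 + 4 + 6 + 6 + 6
```

Each row pairs one id with the label of the group set it belongs to, which is convenient for filtering or joining afterwards.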

Or, if you want to have uniq_ID as a list column:

library(data.table)
data.table(group=sapply(combi_groups, paste, collapse=", "), id=uniq_ID)
#           group                id
#1:             1             L1,L2
#2:          1, 2       L1,L2,L3,L4
#3:       1, 2, 3 L1,L2,L3,L4,L5,L6
#4:    1, 2, 3, 4 L1,L2,L3,L4,L5,L6
#5: 1, 2, 3, 4, 5 L1,L2,L3,L4,L5,L6

data.table(group=sapply(combi_groups, paste, collapse=", "), id=uniq_ID)[2, id]
# [[1]]
# [1] "L1" "L2" "L3" "L4"

Answer 2

In a similar vein to the answer of @Cath, but using Reduce(..., accumulate = TRUE) to create the expanding window of groups. Then loop over the sets of groups with lapply to get the unique ids for each window (the data frame is named d here):

grp <- Reduce(c, unique(d$group), accumulate = TRUE)

lapply(grp, function(x) unique(d$id[d$group %in% x]))
# [[1]]
# [1] "L1" "L2"
# 
# [[2]]
# [1] "L1" "L2" "L3" "L4"
# 
# [[3]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[4]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[5]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"

For naming and prettification, please refer to the nice answer by @Cath.
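
As a rough sketch of how the naming could be combined with the Reduce approach, the snippet below labels each window with its comma-separated groups. It converts the groups to character first, which is my own assumption to avoid depending on how `c()` combines factor elements; `lev`, `labels` and `res` are illustrative names.

```r
# Example data from the question, named d as in this answer
d <- data.frame(
  id = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = factor(c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5)),
  stringsAsFactors = FALSE
)

# Expanding window of groups, built on character labels
lev <- as.character(unique(d$group))
grp <- Reduce(c, lev, accumulate = TRUE)

# One label per window, e.g. "1,2,3"
labels <- vapply(grp, paste, character(1), collapse = ",")

# Unique ids per window, named by the window label
res <- setNames(
  lapply(grp, function(x) unique(d$id[d$group %in% x])),
  labels
)

names(res)
res[["1,2"]]
```

The result is a named list, so a single window can be pulled out by its label rather than by position.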

Answer 3

Another method is to use split and Reduce, feeding the groups to union with accumulate=TRUE:

Reduce(union, split(df$id, df$group), accumulate=TRUE)
# [[1]]
# [1] "L1" "L2"
# 
# [[2]]
# [1] "L1" "L2" "L3" "L4"
# 
# [[3]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[4]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
# 
# [[5]]
# [1] "L1" "L2" "L3" "L4" "L5" "L6"
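
To get from this list to a table like the one in the question, one option is sketched below. It is only an illustration under a few assumptions: the cumulative labels are built with a second Reduce call over the group names, the ids are stored as a list column, and `by_group`, `labels`, `res` and `n_ids` are names I made up. Note that split orders the groups by factor level, whereas unique(df$group) follows order of appearance; the two coincide for this data.

```r
# Example data from the question
df <- data.frame(
  id = c("L1", "L2", "L1", "L3", "L4", "L3", "L5", "L6", "L1", "L4", "L2"),
  group = factor(c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5)),
  stringsAsFactors = FALSE
)

# One vector of ids per group, in factor-level order
by_group <- split(df$id, df$group)

# Cumulative union of ids; the first element is passed through as-is,
# so unique() is applied per group beforehand to be safe
ids <- Reduce(union, lapply(by_group, unique), accumulate = TRUE)

# Cumulative labels "1", "1,2", "1,2,3", ...
labels <- unlist(Reduce(function(a, b) paste(a, b, sep = ","),
                        names(by_group), accumulate = TRUE))

# Assemble the table: ids as a list column, plus a count per window
res <- data.frame(group = labels, stringsAsFactors = FALSE)
res$id <- ids
res$n_ids <- lengths(ids)

res$n_ids  # 2 4 6 6 6
```

Keeping the ids in a list column preserves them as vectors, so `res$id[[2]]` returns the ids for groups 1 and 2 directly.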

Answer 1
8, accepted, 2017-05-09 09:40:56

Answer 2
6, 2017-05-09 10:07:47

Answer 3
4, 2017-05-09 12:15:47