Using dplyr, purrr, or a similar approach to replace a for loop in R

Question

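I have a given distribution of group sizes, for example {2, 4, 1, 1, 2, 3}, where group 1 has 2 individuals, group 2 has 4, group 3 has 1, group 4 has 1, and so on. I want to build a table with one unique row for each group/individual combination (the desired table format is at the bottom of the question).

I currently use a for loop: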

num.groups <- 10
mu <- 4
sd <- 1

group.dist <- round(rnorm(num.groups, mean = mu, sd = sd))

xx <- NULL
for (i in 1:length(group.dist)) {
  temp <- data.frame(Group = i, Individual = 1:group.dist[i])
  xx <- rbind(xx, temp)
}

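I am trying to get away from for loops in general. The real version of my code has hundreds of groups, and I will run the simulation thousands of times, so I am hoping there is a more efficient way to do this.

I apologize if someone has already asked this; it is a specific case that is hard to Google. Thanks!

The table would look like this: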

Answer 1

For example:

library(tidyverse)
d <- tibble(Group = seq_along(group.dist), n = group.dist)

uncount(d, n, .id = 'Individual')

# A tibble: 45 × 2
# Groups:   Group [10]
   Group Individual
   <int>      <int>
 1     1          1
 2     1          2
 3     1          3
 4     1          4
 5     2          1
 6     2          2
 7     2          3
 8     2          4
 9     3          1
10     3          2
# … with 35 more rows
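Because the output above depends on random draws, a deterministic check may be easier to follow. The sketch below applies the same uncount() call to the fixed sizes from the question's example; a plain data.frame is used here only to keep the dependencies to tidyr.

```r
library(tidyr)

# Group sizes taken from the example in the question
group.dist <- c(2, 4, 1, 1, 2, 3)

d <- data.frame(Group = seq_along(group.dist), n = group.dist)

# uncount() repeats each row n times and drops the n column;
# .id adds a counter that restarts at 1 for every original row
xx <- uncount(d, n, .id = "Individual")

nrow(xx)  # 13, one row per group/individual combination
```

The row count equals sum(group.dist), which is a quick sanity check to run on the real data as well.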

Answer 2

Here are two more approaches:

library(data.table)
data.table(Group=1:num.groups)[, .(Individual = seq(1,group.dist[.BY$Group])), by=Group]

Or:

do.call(rbind, lapply(1:num.groups, function(x) data.frame("Group" = x, Individual = 1:group.dist[x])))
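A fully vectorized base R variant is also possible, shown here as a sketch rather than as part of the original answer. It uses rep() and sequence(), and it sidesteps an edge case in any 1:n construction: if a rounded draw is 0, 1:0 counts down and yields two values. The sizes below are hypothetical and chosen to include an empty group.

```r
# Hypothetical sizes, including an empty group to show the edge case
group.dist <- c(2, 0, 3)

# 1:0 counts down, giving c(1, 0), so an empty group would get two rows
1:group.dist[2]

# rep() repeats each group index by its size;
# sequence() builds 1..n for each size and skips sizes of 0
xx <- data.frame(
  Group      = rep(seq_along(group.dist), times = group.dist),
  Individual = sequence(group.dist)
)
xx
```

No per-group data frames are created and nothing is bound together afterwards, which is the part of the loop that tends to get slow with many groups.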

Answer 3

Another possible solution, based on dplyr::group_modify:

library(tidyverse)

num.groups <- 10
mu <- 4
sd <- 1

group.dist <- round(rnorm(num.groups, mean = mu, sd = sd))

data.frame(Group = rep(1:num.groups, group.dist)) %>% 
  group_by(Group) %>% 
  group_modify(~ add_column(.x, Individual = 1:nrow(.x))) %>% 
  ungroup

#> # A tibble: 39 x 2
#>    Group Individual
#>    <int>      <int>
#>  1     1          1
#>  2     1          2
#>  3     1          3
#>  4     1          4
#>  5     2          1
#>  6     2          2
#>  7     2          3
#>  8     2          4
#>  9     3          1
#> 10     3          2
#> # ... with 29 more rows

Or, better still, as suggested by @Axeman:

data.frame(Group = rep(1:num.groups, group.dist)) %>% 
  group_by(Group) %>% 
  mutate(Individual = row_number()) %>% 
  ungroup()
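Since the question mentions running the simulation thousands of times, one way to organize that is to wrap the pipeline in a function and map over run indices. This is only a sketch: simulate_groups, sims, all.sims, the seed, and the run count of 5 are illustrative names and values, not part of the original answer.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: builds one simulated group table
simulate_groups <- function(num.groups, mu, sd) {
  # pmax() guards against a rounded draw below zero, which rep() rejects
  group.dist <- pmax(round(rnorm(num.groups, mean = mu, sd = sd)), 0)
  data.frame(Group = rep(seq_len(num.groups), group.dist)) %>%
    group_by(Group) %>%
    mutate(Individual = row_number()) %>%
    ungroup()
}

set.seed(1)  # arbitrary seed, only for reproducibility

# One table per simulation run
sims <- map(1:5, ~ simulate_groups(num.groups = 10, mu = 4, sd = 1))

# Stack the runs; .id records which run each row came from
all.sims <- bind_rows(sims, .id = "Sim")
```

Keeping the runs in a list and binding once at the end avoids growing a data frame inside a loop, which is the same cost the original rbind() loop pays.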


Answer 1: score 1, accepted, 2022-03-23 18:37:56

Answer 2: score 0, 2022-03-23 18:46:49

Answer 3: score 0, 2022-03-23 18:59:02
