Repeating sets of rows by the value in a column in R with data.table

Question

In R, I am trying to do the following with a data.table:

Suppose my data looks like this:

Class   Person ID      Index
A       1              3
A       2              3
A       5              3
B       7              2
B       12             2
C       18             1
D       25             2
D       44             2

Here, Class is the class a person belongs to. Person ID is a unique identifier for a person. Finally, Index tells us how many people are in each class.

From this, I would like to create a new data table like so:

Class   Person ID      Index
A       1              3
A       2              3
A       5              3
A       1              3
A       2              3
A       5              3
A       1              3
A       2              3
A       5              3
B       7              2
B       12             2
B       7              2
B       12             2
C       18             1
D       25             2
D       44             2
D       25             2
D       44             2

where each class's set of persons is repeated according to the Index variable. For example, the rows for class A are repeated 3 times because the Index for class A is 3.

So far, my code looks like this:

setDT(data)[, list(Class = rep(Person ID, seq_len(.N)), Person ID = sequence(seq_len(.N)), by = Index]

However, I am not getting the correct result, and I feel like there is a simpler way to do this. Would anyone have any ideas? Thank you!

Answer 1

If that particular order is important to you, then perhaps something like this should work:

setDT(data)[, list(PersonID, sequence(rep(.N, Index))), by = list(Class, Index)]
#     Class Index PersonID V2
#  1:     A     3        1  1
#  2:     A     3        2  2
#  3:     A     3        5  3
#  4:     A     3        1  1
#  5:     A     3        2  2
#  6:     A     3        5  3
#  7:     A     3        1  1
#  8:     A     3        2  2
#  9:     A     3        5  3
# 10:     B     2        7  1
# 11:     B     2       12  2
# 12:     B     2        7  1
# 13:     B     2       12  2
# 14:     C     1       18  1
# 15:     D     2       25  1
# 16:     D     2       44  2
# 17:     D     2       25  1
# 18:     D     2       44  2
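For readers who want to reproduce this end to end, here is a minimal sketch that rebuilds the sample data and spells out the repetition with an explicit rep() call instead of relying on PersonID being recycled against the longer V2 column (newer data.table releases may be stricter about recycling columns of unequal length in j). The column name PersonID (no space) and the final setcolorder() step are assumptions made for illustration, not part of the original answer.

```r
library(data.table)

# Sample data rebuilt from the question; the ID column is named PersonID
# (no space), matching the answer above.
data <- data.table(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1, 2, 5, 7, 12, 18, 25, 44),
  Index    = c(3, 3, 3, 2, 2, 1, 2, 2)
)

# Inside each (Class, Index) group, Index is a single value, so
# rep(PersonID, times = Index) repeats the whole set of IDs Index times.
result <- data[, .(PersonID = rep(PersonID, times = Index)),
               by = .(Class, Index)]

# Restore the column order used in the question.
setcolorder(result, c("Class", "PersonID", "Index"))
print(result)
```

Grouping by both Class and Index keeps Index in the result and guarantees it is a scalar within each group, which is what lets rep() treat it as a single repeat count.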

If the order is not important, perhaps:

setDT(data)[rep(1:nrow(data), Index)]
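To make the difference in ordering concrete, the following sketch runs the row-index approach on the sample data. It assumes the same hypothetical PersonID column name as above and uses seq_len(nrow(data)) in place of 1:nrow(data), which behaves the same here but is also safe for an empty table.

```r
library(data.table)

# Sample data rebuilt from the question (PersonID without a space).
data <- data.table(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1, 2, 5, 7, 12, 18, 25, 44),
  Index    = c(3, 3, 3, 2, 2, 1, 2, 2)
)

# Repeat each row number Index times: row 1 three times, then row 2
# three times, and so on. Each person appears in a consecutive run
# rather than as part of a repeated set.
repeated <- data[rep(seq_len(nrow(data)), times = Index)]
print(head(repeated, 6))

# The per-class row counts match the grouped approach; only the order differs.
print(repeated[, .N, by = Class])
```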

Answer 2

Here is a way using dplyr, in case you want to try it:

library(dplyr)
data %>%
  group_by(Class) %>%
  do(data.frame(.[sequence(.$Index[row(.)[,1]]),]))

which gives the output

 #      Class Person.ID Index
 #1      A         1     3
 #2      A         2     3
 #3      A         5     3
 #4      A         1     3
 #5      A         2     3
 #6      A         5     3
 #7      A         1     3
 #8      A         2     3
 #9      A         5     3
 #10     B         7     2
 #11     B        12     2
 #12     B         7     2
 #13     B        12     2
 #14     C        18     1
 #15     D        25     2
 #16     D        44     2
 #17     D        25     2
 #18     D        44     2
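Since do() has been superseded in more recent dplyr releases, the same grouped repetition can also be sketched with slice(), which accepts repeated row positions. This is an alternative to the answer's code rather than a restatement of it, and the PersonID column name (the output above shows Person.ID) is an assumption made for illustration.

```r
library(dplyr)

# Sample data rebuilt from the question (PersonID without a space or dot).
data <- data.frame(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1, 2, 5, 7, 12, 18, 25, 44),
  Index    = c(3, 3, 3, 2, 2, 1, 2, 2)
)

result <- data %>%
  group_by(Class) %>%
  # Positions 1..n() within the group, repeated as a whole set
  # as many times as that group's Index value.
  slice(rep(seq_len(n()), times = first(Index))) %>%
  ungroup()

print(result)
```

Using first(Index) assumes Index is constant within each class, as it is in the question's data.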


Answer 1
2 2014-09-13 09:43:10

Answer 2
0 2014-09-13 10:05:43
