Repeating sets of rows according to the number of rows by column in R with data.table
Currently in R, I am trying to do the following with a data.table.
Suppose my data looks like:
Class Person ID Index
A 1 3
A 2 3
A 5 3
B 7 2
B 12 2
C 18 1
D 25 2
D 44 2
Here, Class is the class a person belongs to. The Person ID variable is a unique identifier for a person. Finally, Index tells us how many people are in each class.
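For reproducibility, the sample above could be built roughly as follows. This is only a sketch: the column name `PersonID` (without a space) is an assumption chosen to match the answers, and the last line simply checks that `Index` equals the number of rows in each class.

```r
library(data.table)

# Sample data from the question; PersonID (no space) is an assumed name.
data <- data.table(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1L, 2L, 5L, 7L, 12L, 18L, 25L, 44L),
  Index    = c(3L, 3L, 3L, 2L, 2L, 1L, 2L, 2L)
)

# Per class: TRUE when every Index value equals the group size.
check <- data[, .(ok = all(Index == .N)), by = Class]
```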
From this, I would like to create a new data table like so:
Class Person ID Index
A 1 3
A 2 3
A 5 3
A 1 3
A 2 3
A 5 3
A 1 3
A 2 3
A 5 3
B 7 2
B 12 2
B 7 2
B 12 2
C 18 1
D 25 2
D 44 2
D 25 2
D 44 2
where each set of persons is repeated by class according to the Index variable. Hence, the rows of class A are repeated 3 times because their Index is 3.
So far, my code looks like:
setDT(data)[, list(Class = rep(Person ID, seq_len(.N)), Person ID = sequence(seq_len(.N)), by = Index]
However, I am not getting the correct result, and I feel like there is a simpler way to do this. Would anyone have any ideas? Thank you!
If that particular order is important to you, then perhaps something like this should work:
setDT(data)[, list(PersonID, sequence(rep(.N, Index))), by = list(Class, Index)]
# Class Index PersonID V2
# 1: A 3 1 1
# 2: A 3 2 2
# 3: A 3 5 3
# 4: A 3 1 1
# 5: A 3 2 2
# 6: A 3 5 3
# 7: A 3 1 1
# 8: A 3 2 2
# 9: A 3 5 3
# 10: B 2 7 1
# 11: B 2 12 2
# 12: B 2 7 1
# 13: B 2 12 2
# 14: C 1 18 1
# 15: D 2 25 1
# 16: D 2 44 2
# 17: D 2 25 1
# 18: D 2 44 2
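A variant that avoids relying on recycling inside `j` (which some data.table versions may reject) is to repeat the IDs explicitly with `rep()`. This is a sketch rather than the answerer's code, and the sample data is rebuilt here, with the assumed column name `PersonID`, so that the snippet runs on its own.

```r
library(data.table)

# Sample data from the question; PersonID (no space) is an assumed name.
data <- data.table(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1L, 2L, 5L, 7L, 12L, 18L, 25L, 44L),
  Index    = c(3L, 3L, 3L, 2L, 2L, 1L, 2L, 2L)
)

# Index is a grouping column, so it is a single value inside each group;
# rep() then repeats the whole block of IDs that many times.
result <- data[, .(PersonID = rep(PersonID, times = Index)), by = .(Class, Index)]

# Restore the column order used in the question.
setcolorder(result, c("Class", "PersonID", "Index"))
```

This keeps only the three original columns, without the helper column `V2`.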
If the order is not important, perhaps:
setDT(data)[rep(1:nrow(data), Index)]
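To see what "order is not important" means here, the sketch below applies the same row-index idea to the sample data (rebuilt with the assumed column name `PersonID`). Each row is repeated consecutively, so the people within a class no longer cycle as a block.

```r
library(data.table)

# Sample data from the question; PersonID (no space) is an assumed name.
data <- data.table(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1L, 2L, 5L, 7L, 12L, 18L, 25L, 44L),
  Index    = c(3L, 3L, 3L, 2L, 2L, 1L, 2L, 2L)
)

# Repeat row i exactly Index[i] times; seq_len() is a safer spelling of 1:nrow().
out <- data[rep(seq_len(nrow(data)), times = Index)]

# First nine IDs: each person appears three times in a row.
first_ids <- out$PersonID[1:9]
```

The result has the same 18 rows as the ordered version, only arranged as 1, 1, 1, 2, 2, 2, 5, 5, 5 and so on.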
Here is a way using dplyr, in case you want to try it:
library(dplyr)
data %>%
group_by(Class) %>%
do(data.frame(.[sequence(.$Index[row(.)[,1]]),]))
which gives the output:
# Class Person.ID Index
#1 A 1 3
#2 A 2 3
#3 A 5 3
#4 A 1 3
#5 A 2 3
#6 A 5 3
#7 A 1 3
#8 A 2 3
#9 A 5 3
#10 B 7 2
#11 B 12 2
#12 B 7 2
#13 B 12 2
#14 C 18 1
#15 D 25 2
#16 D 44 2
#17 D 25 2
#18 D 44 2
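The same group-by-group idea can also be sketched in base R, without extra packages. This is an alternative for comparison, not the answerer's method; the data frame is rebuilt with the assumed column name `PersonID`, and it relies on `Index` being constant within each class.

```r
# Sample data from the question; PersonID (no space) is an assumed name.
data <- data.frame(
  Class    = c("A", "A", "A", "B", "B", "C", "D", "D"),
  PersonID = c(1, 2, 5, 7, 12, 18, 25, 44),
  Index    = c(3, 3, 3, 2, 2, 1, 2, 2)
)

# Split by class, repeat each block of rows Index[1] times, then stack the pieces.
pieces <- lapply(split(data, data$Class), function(g) {
  g[rep(seq_len(nrow(g)), times = g$Index[1]), ]
})
result <- do.call(rbind, pieces)

# Drop the row names generated by split() and rbind().
rownames(result) <- NULL
```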