Assign a group number based on another column by group in R
This is probably very straightforward, but I can't figure out a way to do it. I have some data that looks like this:
domain difference
xxxx 0
xxxx 2
xxxx 14
xxxx 3
xxxx 7
xxxx 2
yyyy 6
yyyy 5
yyyy 13
yyyy 10
zzzz 2
zzzz 5
zzzz 1
zzzz 15
zzzz 16
zzzz 8
zzzz 9
I want it to look like this:
domain difference grp
xxxx 0 1
xxxx 2 1
xxxx 14 2
xxxx 3 2
xxxx 7 2
xxxx 2 2
yyyy 6 1
yyyy 5 1
yyyy 13 1
yyyy 10 1
zzzz 2 1
zzzz 5 1
zzzz 1 1
zzzz 15 2
zzzz 16 3
zzzz 8 3
zzzz 9 3
So basically, within each domain, I want to start a new group whenever the difference is greater than or equal to 14: the rows before that point keep the current group number, and the row with the large difference begins the next one.
I've tried using a nested for loop, where the domains are levels, but I feel like that may be unnecessarily complex, and I'm not sure how to tell the loop to keep going and pick up where it left off after assigning the first group number. Here's what I have so far:
lev <- levels(e_won$domain)
for (i in 1:length(lev)) {
  for (j in 1:nrow(lev)) {
    if (difference[j] >= 14) {
      grp[1:j] = 1
    }
  }
}
I'm completely open to a non-loop solution; a loop is just what I thought of first.
You can try data.table:
library(data.table)
setDT(df1)[, grp:=cumsum(difference>=14)+1L, by=domain][]
# domain difference grp
#1: xxxx 0 1
#2: xxxx 2 1
#3: xxxx 14 2
#4: xxxx 3 2
#5: xxxx 7 2
#6: xxxx 2 2
#7: yyyy 6 1
#8: yyyy 5 1
#9: yyyy 13 1
#10: yyyy 10 1
#11: zzzz 2 1
#12: zzzz 5 1
#13: zzzz 1 1
#14: zzzz 15 2
#15: zzzz 16 3
#16: zzzz 8 3
#17: zzzz 9 3
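To see why `cumsum(difference >= 14) + 1L` gives the desired numbering, it may help to run it on a single domain. The following is a minimal sketch in base R, using the zzzz values from the question; the variable name `starts_new_group` is only illustrative.

```r
# Differences for the zzzz domain, copied from the question
difference <- c(2, 5, 1, 15, 16, 8, 9)

# TRUE marks a row that should open a new group
starts_new_group <- difference >= 14

# cumsum() treats TRUE as 1 and FALSE as 0, so it counts the breaks seen so far;
# adding 1L makes the first group number 1 instead of 0
grp <- cumsum(starts_new_group) + 1L
print(grp)
# [1] 1 1 1 2 3 3 3
```

Grouping by domain (`by=domain` in data.table) simply restarts this running count for each domain.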
Or using dplyr:
library(dplyr)
df1 %>%
  group_by(domain) %>%
  mutate(grp = cumsum(difference >= 14) + 1L)
Or using base R (from @Colonel Beauvel's comments):
df1$grp <- with(df1, ave(difference>=14, domain, FUN=cumsum)) + 1L
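For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the question's data and checks the base R version against the desired output. The name `df1` follows the answer above; the `expected` vector is assumed to be the `grp` column the question asks for, typed in by hand.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  domain = rep(c("xxxx", "yyyy", "zzzz"), times = c(6, 4, 7)),
  difference = c(0, 2, 14, 3, 7, 2, 6, 5, 13, 10, 2, 5, 1, 15, 16, 8, 9)
)

# Desired grp column, copied from the question
expected <- c(1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3)

# Running count of rows with difference >= 14, restarted for each domain
df1$grp <- with(df1, ave(difference >= 14, domain, FUN = cumsum)) + 1L

# Stops with an error if any row differs from the desired output
stopifnot(all(df1$grp == expected))
print(df1)
```

The data.table and dplyr versions can be checked against `expected` in the same way.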