Assign unique id to consecutive rows within a grouping variable in dplyr
Suppose I have the following data.frame:
library(dplyr)  # needed for bind_rows() and %>%
a <- data.frame(group = "A", value = rnorm(mean = 1, sd = 2, n = 150))
b <- data.frame(group = "B", value = rnorm(mean = 1, sd = 2, n = 150))
c <- data.frame(group = "C", value = rnorm(mean = 1, sd = 2, n = 150))
df <- bind_rows(a, b, c)
I want to create a unique ID for each pair of consecutive rows within the grouping variable (group), for example:
df %>% group_by(group) %>% mutate(...)
So each pair of rows within a group should get its own ID.
Any ideas?
We can use gl:
library(dplyr)
df <- df %>%
  group_by(group) %>%
  # gl() builds n() levels, each repeated twice, truncated to n() values
  mutate(id = as.integer(gl(n(), 2, n()))) %>%
  ungroup()
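To see why this works, it may help to run gl on its own. In base R, gl(n, k, length) builds a factor with n levels, each repeated k times, cut to the requested length. The sketch below is only an illustration: the group size of 5 is a made-up value, chosen to be odd so that the last pair is incomplete.

```r
# Hypothetical group size (odd, so the last "pair" has a single row)
n <- 5

# n levels, each repeated twice, truncated to n values
f <- gl(n, 2, n)

# Convert the factor codes to plain integer IDs
ids <- as.integer(f)
print(ids)
```

With an odd group size, the final row simply gets an ID of its own.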
Another dplyr option uses ceiling + row_number():
df %>%
  group_by(group) %>%
  mutate(id = ceiling(row_number() / 2)) %>%
  ungroup()
Another option is to use the rep function:
df %>%
  group_by(group) %>%
  # length.out spelled out in full rather than relying on partial matching
  mutate(id = rep(seq_len(n()), each = 2, length.out = n())) %>%
  ungroup()
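As a quick sanity check, the three approaches can be compared side by side. This is a minimal sketch, not part of the original answers: the toy data frame and the column names id_gl, id_ceiling and id_rep are made up for illustration, with one odd-sized group to exercise the incomplete last pair.

```r
library(dplyr)

# Hypothetical toy data: group A has 5 rows (odd), group B has 4
toy <- data.frame(group = rep(c("A", "B"), times = c(5, 4)), value = 1:9)

res <- toy %>%
  group_by(group) %>%
  mutate(
    id_gl      = as.integer(gl(n(), 2, n())),
    id_ceiling = as.integer(ceiling(row_number() / 2)),
    id_rep     = rep(seq_len(n()), each = 2, length.out = n())
  ) %>%
  ungroup()

# TRUE if all three ID columns agree on every row
all_equal <- all(res$id_gl == res$id_ceiling & res$id_ceiling == res$id_rep)
print(all_equal)
```

Note that ceiling() returns a double, so it is wrapped in as.integer() here to keep the three columns the same type. The IDs restart at 1 in every group, so an ID is unique only in combination with group.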