Add fixed number of rows for each group with values based on another column
I have a large dataframe containing IDs and a start date of intervention for each ID:
ID Date
1 1 17228
2 2 17226
3 3 17230
And I would like to add 2 rows to each ID, with the subsequent dates as the values in those rows:
ID Date
1 1 17228
2 1 17229
3 1 17230
4 2 17226
5 2 17227
6 2 17228
7 3 17230
8 3 17231
9 3 17232
Is there a way to do this, using dplyr if possible? Other approaches are also fine!
We expand the data with uncount(), then, grouped by 'ID', build the sequence that starts at the first 'Date', has as many elements as there are rows in the group (n()), and increments by 1.
library(tidyverse)
df1 %>%
uncount(3) %>%
group_by(ID) %>%
mutate(Date = seq(Date[1], length.out = n(), by = 1))
# A tibble: 9 x 2
# Groups: ID [3]
# ID Date
# <int> <dbl>
#1 1 17228
#2 1 17229
#3 1 17230
#4 2 17226
#5 2 17227
#6 2 17228
#7 3 17230
#8 3 17231
#9 3 17232
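To see what the seq() call inside mutate() produces for a single group, here is a minimal base R sketch. The names first_date and n_rows are made up for illustration; they stand in for Date[1] and n() of the group with ID 1.

```r
# Stand-ins for Date[1] and n() within one group (illustrative names)
first_date <- 17228
n_rows <- 3

# Start at the first date and step by 1 until n_rows values are produced
dates <- seq(first_date, length.out = n_rows, by = 1)
print(dates)
# [1] 17228 17229 17230
```

Because the length comes from n(), the same mutate() step should keep working if uncount() is given a number other than 3.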
Or another option is to unnest a list column
df1 %>%
group_by(ID) %>%
mutate(Date = list(Date[1] + 0:2)) %>%
unnest(cols = Date)
Or with complete
df1 %>%
group_by(ID) %>%
complete(Date = first(Date) + 0:2)
Or using base R (pasting from the comments)
within(df1[rep(seq_len(nrow(df1)), each = 3),], Date <- Date + 0:2)
Or more compactly in data.table
library(data.table)
setDT(df1)[, .(Date = Date + 0:2), ID]
Another base R option is to split the data by ID, append two rows to each piece, and rbind the pieces back together (d is the data frame given under Data below):
do.call(rbind, lapply(split(d, d$ID), function(x){
rbind(x, data.frame(ID = rep(tail(x$ID, 1), 2),
Date = tail(x$Date, 1) + 1:2))
}))
# ID Date
#1.1 1 17228
#1.11 1 17229
#1.2 1 17230
#2.2 2 17226
#2.1 2 17227
#2.21 2 17228
#3.3 3 17230
#3.1 3 17231
#3.2 3 17232
Data
d = structure(list(ID = 1:3, Date = c(17228L, 17226L, 17230L)),
class = "data.frame",
row.names = c("1", "2", "3"))
Using dplyr, we can repeat every row 3 times, group_by ID, and add the offsets 0 to n() - 1 to the dates within each ID.
library(dplyr)
df %>%
slice(rep(seq_len(n()), each = 3)) %>%
group_by(ID) %>%
mutate(Date = Date + 0:(n() - 1))
# ID Date
# <int> <int>
#1 1 17228
#2 1 17229
#3 1 17230
#4 2 17226
#5 2 17227
#6 2 17228
#7 3 17230
#8 3 17231
#9 3 17232
A base R one-liner using the same logic as above would be
transform(df[rep(seq_len(nrow(df)), each = 3),], Date = Date + 0:2)
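The one-liner works because 0:2 is recycled across the repeated rows, which relies on each ID appearing exactly once in the input. As a rough sketch of how that logic could be generalised to any number of extra rows, the helper below wraps it in a function. The function name add_following_dates and its argument k are hypothetical, and df is rebuilt here from the sample data in the question.

```r
# Sample data from the question (dates stored as integer day counts)
df <- data.frame(ID = 1:3, Date = c(17228L, 17226L, 17230L))

# Hypothetical helper: keep each row and add k following dates per ID.
# Assumption: the input has exactly one row per ID.
add_following_dates <- function(data, k = 2) {
  stopifnot(!anyDuplicated(data$ID))
  # Row indices 1,1,1,2,2,2,... so every row appears k + 1 times
  idx <- rep(seq_len(nrow(data)), each = k + 1)
  out <- data[idx, , drop = FALSE]
  # Offsets 0..k, repeated once per original row
  out$Date <- out$Date + rep(0:k, times = nrow(data))
  rownames(out) <- NULL
  out
}

res <- add_following_dates(df, k = 2)
print(res)
```

Writing the offsets out with rep() instead of leaning on recycling makes the intent explicit, and the stopifnot() check fails early if an ID is duplicated rather than silently producing misaligned dates.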