
Add fixed number of rows for each group with values based on another column

I have a large data frame containing IDs and an intervention start date for each ID:

  ID Date
1 1  17228
2 2  17226
3 3  17230

I would like to add 2 rows for each ID, with the subsequent dates as the values in those rows:

  ID Date
1 1  17228
2 1  17229
3 1  17230
4 2  17226
5 2  17227
6 2  17228
7 3  17230
8 3  17231
9 3  17232

Is there a way to do this with dplyr, if possible? Other approaches are fine too!

We expand the data with uncount(), then, grouped by 'ID', build a sequence (seq) that starts at the first 'Date', has as many elements as the group has rows (n()), and increments by 1:

library(tidyverse)
df1 %>%
  uncount(3) %>% 
  group_by(ID) %>% 
  mutate(Date = seq(Date[1], length.out = n(), by = 1))
# A tibble: 9 x 2
# Groups:   ID [3]
#     ID  Date
#  <int> <dbl>
#1     1 17228
#2     1 17229
#3     1 17230
#4     2 17226
#5     2 17227
#6     2 17228
#7     3 17230
#8     3 17231
#9     3 17232

Another option is to build a list column and unnest it:

df1 %>%
   group_by(ID) %>% 
   mutate(Date = list(Date[1] + 0:2)) %>% 
   unnest(cols = Date)

Or with complete():

df1 %>%
   group_by(ID) %>%
   complete(Date = first(Date) + 0:2)

Or using base R (pasted from the comments):

within(df1[rep(seq_len(nrow(df1)), each = 3),], Date <- Date + 0:2)
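The one-liner above works because `0:2` is recycled across the repeated rows, which lines up only when every ID starts with exactly one row. As a rough sketch, the same idea can be written with explicit offsets and a configurable number of extra rows. The helper name `add_following_days` and its `n_extra` argument are made up for illustration, and the input is assumed to have one row per ID with integer columns.

```r
# Hypothetical helper (not from the original answers): repeat each row
# (n_extra + 1) times and add the offsets 0..n_extra to the dates.
# Assumes one row per ID in the input.
add_following_days <- function(df, n_extra = 2) {
  reps <- n_extra + 1
  out <- df[rep(seq_len(nrow(df)), each = reps), ]
  # Build the offsets explicitly, one full 0..n_extra run per original row,
  # instead of relying on implicit recycling
  out$Date <- out$Date + rep(0:n_extra, times = nrow(df))
  rownames(out) <- NULL
  out
}

# Input rebuilt from the question; integer columns are an assumption
df1 <- data.frame(ID = 1:3, Date = c(17228L, 17226L, 17230L))
res <- add_following_days(df1, n_extra = 2)
res
```

Calling it with `n_extra = 2` should reproduce the nine-row result shown in the question; other values give more or fewer follow-up rows per ID.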

Or, more compactly, with data.table:

library(data.table)
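setDT(df1)[, .(Date = Date + 0:2), ID]

Another base R option: split the data by ID, append two rows to each piece, and bind the pieces back together:

do.call(rbind, lapply(split(d, d$ID), function(x){
    rbind(x, data.frame(ID = rep(tail(x$ID, 1), 2),
                        Date = tail(x$Date, 1) + 1:2))
}))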
#     ID  Date
#1.1   1 17228
#1.11  1 17229
#1.2   1 17230
#2.2   2 17226
#2.1   2 17227
#2.21  2 17228
#3.3   3 17230
#3.1   3 17231
#3.2   3 17232

Data

d = structure(list(ID = 1:3, Date = c(17228L, 17226L, 17230L)),
              class = "data.frame",
              row.names = c("1", "2", "3"))

Using dplyr, we can repeat every row 3 times, group_by ID, and add an offset running from 0 to n() - 1 to the dates within each ID.

library(dplyr)

df %>%
  slice(rep(seq_len(n()), each = 3)) %>%
  group_by(ID) %>%
  mutate(Date = Date + 0:(n() - 1))

#    ID  Date
#  <int> <int>
#1     1 17228
#2     1 17229
#3     1 17230
#4     2 17226
#5     2 17227
#6     2 17228
#7     3 17230
#8     3 17231
#9     3 17232

A base R one-liner using the same logic would be:

transform(df[rep(seq_len(nrow(df)), each = 3),], Date = Date + 0:2)
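The Date values in the question look like day counts since 1970-01-01, but that is an assumption; the question does not say how they were produced. If the column is stored as a real Date instead, the same arithmetic should still apply, since adding an integer to a Date moves it forward by that many days. A minimal sketch under that assumption:

```r
# Assumption: the numbers in the question are days since 1970-01-01
df <- data.frame(
  ID   = 1:3,
  Date = as.Date(c(17228, 17226, 17230), origin = "1970-01-01")
)

# Same one-liner as above; Date + 0:2 adds 0, 1 and 2 days within each ID
out <- transform(df[rep(seq_len(nrow(df)), each = 3), ], Date = Date + 0:2)
rownames(out) <- NULL
out
```

The result keeps the Date class, so the new rows print as calendar dates rather than numbers.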

