Iteratively applying a function within a grouped dplyr data frame to create a column in R

Question

Suppose I'm given the following input data frame:

ID  Date
1   20th May, 2020
1   21st May, 2020
1   28th May, 2020
1   29th May, 2020
2   20th May, 2020
2   1st June, 2020

I want to generate the following data frame:

ID  Date            Delta
1   20th May, 2020      0
1   21st May, 2020      1
1   28th May, 2020      7
1   29th May, 2020      1
2   20th May, 2020      0
2   1st June, 2020     12

The idea is that I first group by ID. Then, within each ID, I iterate over the dates and subtract the previous date from the current date. The exception is the first date in each group, which has no previous date and so gets a delta of 0.

I have been using dplyr, but I am unsure how to do this per group and how to do it iteratively.

My goal is to filter the deltas and retain 0 and anything larger than 7, but the deltas must follow the 'preceding date' logic within a specific ID.

Answer 1

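library(dplyr)
dat %>%
  # Strip the ordinal suffix ("th", "st", ...) so the day parses as a number
  mutate(Date = as.Date(gsub("[a-z]{2} ", " ", Date), format = "%d %b, %Y")) %>%
  group_by(ID) %>%
  # Day differences within each ID; the first row of a group gets 0
  mutate(Delta = c(0, diff(Date))) %>%
  ungroup()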
# # A tibble: 6 x 3
#      ID Date       Delta
#   <dbl> <date>     <dbl>
# 1     1 2020-05-20     0
# 2     1 2020-05-21     1
# 3     1 2020-05-28     7
# 4     1 2020-05-29     1
# 5     2 2020-05-20     0
# 6     2 2020-06-01    12

Steps:

1. Remove the ordinal suffix from the day numbers, so that we can
2. convert the strings to proper Date-class objects, then
3. diff them within ID groups.
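
The three steps can also be looked at one at a time in base R. This is only an illustrative sketch, not the answer's exact code: the vector `raw` and the names `clean`, `dates`, and `delta` are made up for the example, the regular expression is a slightly stricter variant of the one above, and month-name parsing assumes an English locale.

```r
# Raw strings in the same style as the question (ordinal suffix on the day).
raw <- c("20th May, 2020", "21st May, 2020", "1st June, 2020")

# Step 1: drop the ordinal suffix that directly follows the day digits.
clean <- sub("([0-9]+)(st|nd|rd|th)", "\\1", raw)

# Step 2: parse to Date. Month names are matched in the current locale,
# so this assumes an English locale.
dates <- as.Date(clean, format = "%d %B, %Y")

# Step 3: differences in days between consecutive dates.
# The first element has no predecessor, so it gets 0.
delta <- c(0, as.numeric(diff(dates)))
```

Inside `group_by(ID)`, step 3 runs once per group, which is why each ID starts again at 0.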

Data

dat <- structure(list(ID = c(1, 1, 1, 1, 2, 2), Date = c("  20th May, 2020", "  21st May, 2020", "  28th May, 2020", "  29th May, 2020", "  20th May, 2020", "  1st June, 2020")), class = "data.frame", row.names = c(NA, -6L))

Answer 2

Similar logic to @r2evans's answer, but with different functions.

library(dplyr)
library(lubridate)

df %>%
  mutate(Date = dmy(Date)) %>%
  group_by(ID) %>%
  mutate(Delta = as.integer(Date - lag(Date, default = first(Date)))) %>%
  ungroup()

#     ID Date       Delta
#  <int> <date>     <int>
#1     1 2020-05-20     0
#2     1 2020-05-21     1
#3     1 2020-05-28     7
#4     1 2020-05-29     1
#5     2 2020-05-20     0
#6     2 2020-06-01    12

Data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L), Date = c("20th May, 2020", 
"21st May, 2020", "28th May, 2020", "29th May, 2020", "20th May, 2020", 
"1st June, 2020")), class = "data.frame", row.names = c(NA, -6L))
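
The question also mentions keeping only deltas equal to 0 or larger than 7. A possible follow-up is sketched below, reading "larger than 7" literally as strictly greater than 7. The names `parsed` and `kept` are hypothetical, the dates are entered already parsed to keep the example self-contained, and the `arrange()` call is an added assumption that rows should be in date order within each ID.

```r
library(dplyr)

# Same values as the question's example, with dates already parsed.
parsed <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  Date = as.Date(c("2020-05-20", "2020-05-21", "2020-05-28",
                   "2020-05-29", "2020-05-20", "2020-06-01"))
)

kept <- parsed %>%
  # Assumption: deltas are meant to follow date order within each ID.
  arrange(ID, Date) %>%
  group_by(ID) %>%
  # Days since the previous row of the same ID; 0 for the first row.
  mutate(Delta = as.integer(Date - lag(Date, default = first(Date)))) %>%
  ungroup() %>%
  # Filter only after all deltas are computed, so every delta still
  # refers to the preceding date in the unfiltered data.
  filter(Delta == 0 | Delta > 7)
```

Because the filter runs after the grouped `mutate()`, removing rows does not change how any remaining delta was calculated.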


Solution 1
1, accepted, 2021-07-01 01:20:21

Solution 2
1, 2021-07-01 03:02:27
