Keeping the first row after computing differences between rows with dplyr::lag

Question

My question is similar to this question and this question, with one nuance that seems to make it overly complicated.

A sample of my data:

ind_id   wt   date
1002     25   1987-07-27
1002     15   1988-05-05
2340     30   1987-03-18
2340     52   1989-08-15

I am calculating the difference between wt values after group_by(ind_id).

To do this:

df <- df %>% 
    group_by(ind_id) %>%
    mutate(mass_diff = wt - lag(wt))

This gives me the following output:

ind_id   wt   date        mass_diff
1002     15   1988-05-05  -10
2340     52   1989-08-15  22

However, the output I want should keep the first wt record rather than the last.

Desired output:

ind_id   wt   date        mass_diff
1002     25   1988-05-05  -10
2340     30   1989-08-15  22

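Note that wt is the only column I want to keep from the first row. (Keep in mind that this example is oversimplified; I am actually working with 18 rows.)

Any suggestions (using dplyr) would be much appreciated!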

Answer 1

A possible solution:

library(tidyverse)

df <- structure(list(ind_id = c(1002, 1002, 2340, 2340), wt = c(25, 
15, 30, 52), date = structure(c(6416, 6699, 6285, 7166), class = "Date")), row.names = c(NA, 
-4L), class = "data.frame")

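df %>% 
  group_by(ind_id) %>%
  # difference from the previous row within each individual (NA on the first row)
  mutate(mass_diff = wt - lag(wt)) %>% 
  # copy the first computed difference up onto the group's first row
  fill(mass_diff, .direction = "up") %>% 
  # carry the last date onto every row of the group
  mutate(date = last(date)) %>% 
  # keep only the first row per group, which still holds the first wt
  slice_head(n = 1) %>% 
  ungroup()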

#> # A tibble: 2 × 4
#>   ind_id    wt date       mass_diff
#>    <dbl> <dbl> <date>         <dbl>
#> 1   1002    25 1988-05-05       -10
#> 2   2340    30 1989-08-15        22
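The same result can probably also be reached without lag() by collapsing each group directly. The following is only a sketch, not part of the original answer, and it assumes every ind_id has exactly two records as in the sample data. With more than two rows per individual, last(wt) - first(wt) is the total change, whereas the pipeline above returns the second value minus the first. The name result is chosen here for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ind_id = c(1002, 1002, 2340, 2340),
  wt     = c(25, 15, 30, 52),
  date   = as.Date(c("1987-07-27", "1988-05-05", "1987-03-18", "1989-08-15"))
)

result <- df %>%
  group_by(ind_id) %>%
  summarise(
    # computed first, while wt still refers to the original column
    mass_diff = last(wt) - first(wt),
    # keep the first weight and the last date of each individual
    wt        = first(wt),
    date      = last(date),
    .groups   = "drop"
  ) %>%
  # restore the column order used in the question
  select(ind_id, wt, date, mass_diff)

print(result)
```

The order of the expressions inside summarise() matters: later expressions see columns created by earlier ones, so mass_diff has to be calculated before wt is overwritten with its first value.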


Answer 1
0 2022-01-31 20:19:33
