Mutate a new column based on lagged values within that column - dplyr approach

A base approach and a dplyr approach were detailed here: "How to create a column which use its own lag value using dplyr".

I want the first row of "c" to equal k, and every subsequent row to be the lag of "c" plus "a" minus "b".

The base approach works.

But the dplyr approach does not produce the same result as the base approach. See:

library(tidyverse)
k <- 10 # Set a k value
df1 <- tribble(
  ~a, ~b,
  1,  1,
  1,  2,
  1,  3,
  1,  4,
  1,  5,)
# Base approach
df1$c <- df1$a - df1$b
df1[1, "c"] <- k
df1$c <- cumsum(df1$c)
df1
#> # A tibble: 5 x 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     1    10
#> 2     1     2     9
#> 3     1     3     7
#> 4     1     4     4
#> 5     1     5     0
# New df
df2 <- tribble(
  ~a, ~b,
  1,  1,
  1,  2,
  1,  3,
  1,  4,
  1,  5,)
# dplyr approach
df2 %>% 
  mutate(c = lag(cumsum(a - b), 
                 default = k))
#> # A tibble: 5 x 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     1    10
#> 2     1     2     0
#> 3     1     3    -1
#> 4     1     4    -3
#> 5     1     5    -6
# Gives two different dataframes

Created on 2020-03-05 by the reprex package (v0.3.0)

Alternative code and desired output:

library(tidyverse)
# Desired output
tribble(
  ~a, ~b, ~c,
  1, 1, 10,
  1, 2, 9,
  1, 3, 7,
  1, 4, 4,
  1, 5, 0)
#> # A tibble: 5 x 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     1    10
#> 2     1     2     9
#> 3     1     3     7
#> 4     1     4     4
#> 5     1     5     0
df2 <- tribble(
  ~a, ~b,
  1,  1,
  1,  2,
  1,  3,
  1,  4,
  1,  5,)
k <- 10
df2 %>% 
  mutate(c = case_when(
    row_number() == 1 ~ k,
    row_number() != 1 ~ lag(c) + a - b))
#> Error in x[seq_len(xlen - n)]: object of type 'builtin' is not subsettable

Created on 2020-03-05 by the reprex package (v0.3.0)
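
A note on the error message, as a hedged sketch in base R: when `mutate()` evaluates `lag(c)`, `df2` has no column named `c` yet, so the name most likely resolves to base R's `c()` function. Subsetting a builtin function raises the same message as in the reprex. The variable `msg` is illustrative only.

```r
# `c` is not a column at this point, so R falls back to the base function c().
print(typeof(c))

# Subsetting a builtin function fails; capture the message instead of stopping.
msg <- tryCatch(c[1], error = function(e) conditionMessage(e))
print(msg)
```

Even with a differently named column, `mutate()` computes the whole column in one vectorised step, so an expression cannot read values of the column it is still creating.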

Is there another tidyverse approach that provides the output of the base approach?

We can do:

library(dplyr)
df2 %>%  mutate(c = k + cumsum(a-b))

# A tibble: 5 x 3
#      a     b     c
#  <dbl> <dbl> <dbl>
#1     1     1    10
#2     1     2     9
#3     1     3     7
#4     1     4     4
#5     1     5     0
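
To see why this differs from the `lag(cumsum(a - b), default = k)` attempt in the question, the intermediate vectors can be inspected in plain R. This is a minimal sketch: `c(k, head(cs, -1))` is used as a base R stand-in for `dplyr::lag(cs, default = k)`, and the names `d`, `cs`, `lagged`, and `shifted` are illustrative.

```r
k <- 10
a <- rep(1, 5)
b <- 1:5

d  <- a - b        # per-row change:  0, -1, -2, -3, -4
cs <- cumsum(d)    # running total:   0, -1, -3, -6, -10

# Stand-in for dplyr::lag(cs, default = k): shift down one row, fill row 1 with k.
# k appears once and is never added to the later rows.
lagged <- c(k, head(cs, -1))
print(lagged)      # values: 10, 0, -1, -3, -6

# Adding k to the running total instead carries k through every row.
shifted <- k + cs
print(shifted)     # values: 10, 9, 7, 4, 0
```

The second form matches the desired output here only because the first value of `a - b` is 0.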

When the first value of a - b is not equal to 0, drop the first difference before accumulating, so that the first row stays at k and the later rows build on it:

df2 %>% mutate(c = c(k, k + cumsum((a - b)[-1])))
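
The recurrence itself (first row equal to k, each later row equal to the previous "c" plus "a" minus "b") can also be written directly with base R's `Reduce()`, which is a handy check on any closed form. This is a sketch under stated assumptions: `running_balance` is a hypothetical helper, not part of the original answer, and the sample data is chosen so that the first value of `a - b` is not 0.

```r
# Hypothetical helper, not part of the original answer:
# c[1] = k, and c[i] = c[i - 1] + a[i] - b[i] for i > 1.
running_balance <- function(a, b, k) {
  d <- (a - b)[-1]                              # changes for rows 2..n
  Reduce(`+`, d, init = k, accumulate = TRUE)   # k, k + d[1], k + d[1] + d[2], ...
}

k <- 10
a <- c(2, 1, 1, 1, 1)   # first a - b is 1, not 0
b <- 1:5

from_recurrence <- running_balance(a, b, k)
from_cumsum     <- c(k, k + cumsum((a - b)[-1]))

print(from_recurrence)  # values: 10, 9, 7, 4, 0
stopifnot(isTRUE(all.equal(from_recurrence, from_cumsum)))
```

Inside a pipeline this could be called as `df2 %>% mutate(c = running_balance(a, b, k))`, since `mutate()` accepts any function that returns one value per row.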
