
R dplyr: Mutate with Reduce with init, after group_by

Is it possible to specify an initial value for Reduce without it being added to the data frame?

For example, with function:

f <- function(x, y) if (y<0) -x * y else x + y

Acting on data frame:

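    library(dplyr)  # provides %>%, tibble(), group_by(), mutate(), ungroup()
    set.seed(0)
    # first value fixed at -0.9, then 9 values sampled with replacement
    df <- tibble(x = c(-0.9, sample(c(-0.9, 1:3), 9, replace = TRUE)), id = 'a')
    df$id[6:10] <- 'b'
    # running result of f within each id group, without an initial value
    df <- df %>% group_by(id) %>% mutate(sumprod = Reduce(f, x, accumulate = TRUE)) %>% ungroup()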
    df$target <- c(0, 3, 4, 5, 7, 3, 2.7, 5.7, 8.7, 10.7)
    df

# A tibble: 10 x 4
       x    id sumprod target
   <dbl> <chr>   <dbl>  <dbl>
 1  -0.9     a    -0.9    0.0
 2   3.0     a     2.1    3.0
 3   1.0     a     3.1    4.0
 4   1.0     a     4.1    5.0
 5   2.0     a     6.1    7.0
 6   3.0     b     3.0    3.0
 7  -0.9     b     2.7    2.7
 8   3.0     b     5.7    5.7
 9   3.0     b     8.7    8.7
10   2.0     b    10.7   10.7

The goal is column target. I've tried using init with Reduce, but that adds an extra element to the result.

Reduce(f, df$x[1:5], acc=TRUE, init=0)
 [1]  0  0  3  4  5  7

Using this within mutate produces an error:

> df <- df %>% group_by(id) %>% mutate(sumprod = Reduce(f, x, acc=TRUE, init=0)) %>% ungroup()
Error in mutate_impl(.data, dots) : 
  Column `sumprod` must be length 5 (the group size) or one, not 6

When init is given, Reduce logically adds it to the start of x (when proceeding left to right) or to the end of x (when proceeding right to left), so the accumulated result has one more element than x. If you don't need that element, you can use tail(..., -1) to remove the first element:

df %>% 
    group_by(id) %>% 
    mutate(sumprod = tail(Reduce(f, x, acc=TRUE, init=0), -1)) %>% 
    ungroup()

# A tibble: 10 x 4
#       x    id sumprod target
#   <dbl> <chr>   <dbl>  <dbl>
# 1  -0.9     a     0.0    0.0
# 2   3.0     a     3.0    3.0
# 3   1.0     a     4.0    4.0
# 4   1.0     a     5.0    5.0
# 5   2.0     a     7.0    7.0
# 6   3.0     b     3.0    3.0
# 7  -0.9     b     2.7    2.7
# 8   3.0     b     5.7    5.7
# 9   3.0     b     8.7    8.7
#10   2.0     b    10.7   10.7
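
To see where the extra element comes from, here is a minimal sketch on a plain vector, assuming the values of group a from the question. The names with_init and trimmed are only illustrative.

```r
f <- function(x, y) if (y < 0) -x * y else x + y

x <- c(-0.9, 3, 1, 1, 2)  # values of group "a" in the question

# init is prepended, so the accumulated result has length(x) + 1 elements
with_init <- Reduce(f, x, accumulate = TRUE, init = 0)
length(with_init)

# dropping the first element (the init itself) restores the group size
trimmed <- tail(with_init, -1)
trimmed
```

Because trimmed has the same length as x, it can be assigned inside mutate without triggering the group-size error.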

With tidyverse, there is accumulate from purrr. As with Reduce, the initial value is included in the result, so it is dropped here with [-1]:

library(tidyverse)
df %>%
   group_by(id) %>%
   mutate(sumprod = accumulate(.x = x, .f = f, .init = 0)[-1]) %>%
   ungroup()
# A tibble: 10 x 3
#       x    id sumprod
#   <dbl> <chr>   <dbl>
# 1  -0.9     a     0.0
# 2   3.0     a     3.0
# 3   1.0     a     4.0
# 4   1.0     a     5.0
# 5   2.0     a     7.0
# 6   3.0     b     3.0
# 7  -0.9     b     2.7
# 8   3.0     b     5.7
# 9   3.0     b     8.7
#10   2.0     b    10.7
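
If neither dplyr nor purrr is available, the same per-group computation can be sketched in base R with ave(). This is an alternative that does not appear in the answers above; the data frame is typed in by hand from the table in the question, and running_f is a hypothetical helper name.

```r
f <- function(x, y) if (y < 0) -x * y else x + y

# data copied from the table in the question
df <- data.frame(
  x  = c(-0.9, 3, 1, 1, 2, 3, -0.9, 3, 3, 2),
  id = rep(c("a", "b"), each = 5)
)

# hypothetical helper: accumulate from init = 0, then drop the init element
running_f <- function(v) tail(Reduce(f, v, accumulate = TRUE, init = 0), -1)

# ave() applies running_f within each id group and keeps the row order
df$sumprod <- ave(df$x, df$id, FUN = running_f)
df
```

ave() requires the function to return one value per input element, which is why the init element still has to be removed.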
