dplyr::mutate: temporary expensive variable as input to several other operations, rowwise

Question

My problem is a little tricky to show with real data, but I hope the following example explains it:

library(dplyr)

# tibble() replaces the deprecated data_frame()
tibble(a = c(1, 2), b = c(3, 4)) %>%
  rowwise() %>%
  mutate(c = a * b, d = c - 1, e = c + 2) %>%
  ungroup()

In the above example, of course, rowwise is not needed.

Now let's suppose that the calculation that produces c is time-consuming and not vectorized, and that c is a large object. So you don't want to compute it twice, and you want it cleared from memory after each row has been processed.

Is there a clever way to do this? Perhaps with purrr::map?

Answer 1

Here is an answer using purrr's invoke_rows.

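# Note: newer purrr releases moved invoke_rows() to the purrrlyr package
library(purrr)

MyDf <- data.frame(a = c(1, 2), b = c(3, 4))
invoke_rows(.d = MyDf,
            .f = function(a, b) {
              tmp <- a * b  # the intermediate (c in the question), computed once per row
              c(d = tmp - 1,
                e = tmp + 2)
            },
            .collate = "cols")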

Update

In response to the comment from @JanStanstrup: if you have another column that you want in the output but that does not appear in the calculation, you can do this:

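MyDf <- data.frame(a = c(1, 2), b = c(3, 4), dummy = c(6, 7))
invoke_rows(.d = MyDf,
            .f = function(a, b, ...) {  # ... absorbs the unused columns
              tmp <- a * b  # the intermediate (c in the question), computed once per row
              c(d = tmp - 1,
                e = tmp + 2)
            },
            .collate = "cols")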

Here, dummy and any other columns are passed to the .f function through the ... argument. They are not used inside that function, so they are simply carried along to the output.
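Since the question asks about purrr::map, here is a rough sketch of the same idea using purrr::pmap together with dplyr::bind_rows and dplyr::bind_cols. This is not from the original answer. The function names expensive and row_result are hypothetical, and expensive is only a stand-in for the real slow, non-vectorized computation. The intermediate is computed once per row and is local to the function call, so it should become eligible for garbage collection once that row is done.

```r
library(purrr)
library(dplyr)

MyDf <- data.frame(a = c(1, 2), b = c(3, 4), dummy = c(6, 7))

# Hypothetical stand-in for the expensive, non-vectorized computation
expensive <- function(a, b) a * b

# Called once per row; ... absorbs columns the calculation does not use
row_result <- function(a, b, ...) {
  tmp <- expensive(a, b)                 # computed once, local to this call
  data.frame(d = tmp - 1, e = tmp + 2)   # one-row result for this row
}

# pmap() passes each row's columns to row_result by name;
# bind_rows() stacks the one-row results, bind_cols() attaches them to MyDf
res <- bind_cols(MyDf, bind_rows(pmap(MyDf, row_result)))
res
```

The result should keep a, b and dummy and add the new columns d and e, much like the invoke_rows call with .collate = "cols".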

solution1
3 ACCEPTED 2016-12-06 16:35:15
