简体   繁体   中英

Can I speedup my R code with Rcpp?

I defined an R function that contains a matrix, a vector and a parameter a . I need to compute the results of the function for different values of a . This is simple to code in R , but very slow when the matrix is "big" and number of the parameter values are large.

Can I define the function in R and do the for loop in Rcpp ?

Can it speed up the computations?

A minimal example of a foo function in R is

f <- function(X,y,a){
  p = ncol(X)
  res = (crossprod(X) + a*diag(1,p))%*%crossprod(X,y)
  }

set.seed(0)
X <- matrix(rnorm(50*5),50,5)
y <- rnorm(50)
a <- seq(0,1,0.1)

result <- matrix(NA,ncol(X),length(a))

for(i in 1:length(a)){                  # Can I do this part in Rcpp?
  result[,i] <- f(X,y,a[i])
  }

result

I simply suggest avoiding recomputing X'X and X'y in the loop as they are loop-invariant.

f <- function (XtX, Xty, a) (XtX + diag(a, ncol(XtX))) %*% Xty

set.seed(0)
X <- matrix(rnorm(50*5),50,5)
y <- rnorm(50)
a <- seq(0,1,0.1)

result1 <- matrix(NA, ncol(X), length(a))

XtX <- crossprod(X)
Xty <- crossprod(X, y)

for(i in 1:length(a)) {
  result1[,i] <- f(XtX, Xty, a[i])
  }

## compare with your `result`
all.equal(result, result1)
#[1] TRUE

hours later...

When I return I see something more to simplify.

(XtX + diag(a, ncol(XtX))) %*% Xty = XtX %*% Xty + diag(a, ncol(XtX)) %*% Xty
                                   = XtX %*% Xty + a * Xty

So actually, XtX %*% Xty is loop-invariant, too.

f <- function (XtX.Xty, Xty, a) XtX.Xty + a * Xty

set.seed(0)
X <- matrix(rnorm(50*5),50,5)
y <- rnorm(50)
a <- seq(0,1,0.1)

result2 <- matrix(NA, ncol(X), length(a))

XtX <- crossprod(X)
Xty <- c(crossprod(X, y))  ## one-column matrix to vector
XtX.Xty <- c(XtX %*% Xty)  ## one-column matrix to vector

for(i in 1:length(a)) {
  result2[,i] <- f(XtX.Xty, Xty, a[i])
  }

## compare with your `result`
all.equal(result, result2)
#[1] TRUE

And it turns out that we can get rid of the loop:

## "inline" function `f`
for(i in 1:length(a)) {
  result2[,i] <- XtX.Xty + a[i] * Xty
  }

## compare with your `result`
all.equal(result, result2)
#[1] TRUE

## do it with recycling rule
for(i in 1:length(a)) {
  result2[,i] <- a[i] * Xty
  }
result2 <- XtX.Xty + result2

## compare with your `result`
all.equal(result, result2)
#[1] TRUE

## now use `tcrossprod`
result2 <- XtX.Xty + tcrossprod(Xty, a)

## compare with your `result`
all.equal(result, result2)
#[1] TRUE

I completely agree with you that your example code in the question is just a "foo" . And you may not have thought carefully about it when posting it. However, it suffices to show that when writing a loop, either an R loop or a C / C++ / FORTRAN loop, we should always seek to pull those loop-invariant out of the loop to reduce computational complexity.

Your concern is to get speedup with Rcpp (or any compiled language). You want to vectorize a section of R code that is not readily vectorized. However, "%*%" , crossprod and tcrossprod are mapped to BLAS (FORTRAN code) and are not R-level computations. You readily have everything vectorized.

Don't always blame the interpretation overhead ( as R is an interpreted language ) of an R-loop for the poor performance. Such overhead is insignificant if each iteration is doing some "heavy" computations, like big matrix computations (using BLAS) or fitting statistical models (like lm ). In fact, if you do want to write such a loop with a compiled language, use lapply function. This function implements loop at C level, and calls R function for each iteration. Alternatively, Ralf's answer is an Rcpp equivalent. In my opinion, a loop nest written in R code is more likely to be inefficient.

The answer by 李哲源 correctly identifies what should be done in your case. As for your original question the answer is two-fold: Yes, you can move the loop to C++ with Rcpp. No, you won't gain performance:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericMatrix fillMatrix(Rcpp::NumericMatrix X,
                   Rcpp::NumericVector y,
                   Rcpp::NumericVector a,
                   Rcpp::Function f) {
  Rcpp::NumericMatrix result = Rcpp::no_init(X.cols(), a.length());
  for (int i = 0; i < a.length(); ++i) {
    result(Rcpp::_, i) = Rcpp::as<Rcpp::NumericVector>(f(X, y, a[i]));
  }
  return result;
}

/*** R
f <- function(X,y,a){
  p = ncol(X)
  res = (crossprod(X) + a*diag(1,p))%*%crossprod(X,y)
  }

X <- matrix(rnorm(500*50),500,50)
y <- rnorm(500)
a <- seq(0,1,0.01)

system.time(fillMatrix(X, y, a, f))
#   user  system elapsed 
#  0.052   0.077   0.075 
system.time({result <- matrix(NA,ncol(X),length(a))

for(i in 1:length(a)){
  result[,i] <- f(X,y,a[i])
  }
})
#   user  system elapsed 
#  0.060   0.037   0.049 
*/

So the Rcpp solution is actually slower than the R solution in this case. Why? Because the real work is done within the function f . This is the same for both solutions, but the Rcpp solution has the additional overhead of calling back to R from C++. Note that for loops in R are not necessarily slow . BTW, here some benchmark data:

          expr      min       lq     mean   median       uq      max neval
 fillMatrixR() 41.22305 41.86880 46.16806 45.20537 49.11250 65.03886   100
 fillMatrixC() 44.57131 44.90617 49.76092 50.99102 52.89444 66.82156   100

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM