简体   繁体   中英

More efficient function to generate a special matrix in R

I am trying to program an R function to create a special matrix called Ui in this paper (page 5): http://joshuachan.org/papers/Chan-Jeliazkov-2009.pdf

So far I have developed this function, which I suspect could be improved (ie made more efficient):

create_Uj <- function(uj) {
  q <- length(uj) 
  if (q == 1) return(0)
  nr <- q
  nc <- q*(q-1)/2
  Uj <- matrix(0, nrow = nr, ncol = nc)
  for (kk in 2:nr) {
    uj_sub <- uj[1:(kk-1)]
    Uj[kk, 1:(kk*(kk-1)/2)] <- c(rep(0, (kk-1)*(kk-2)/2), uj[1:(kk-1)])
  }
  -Uj
}

create_Uj(1:4)

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    0    0    0    0    0
[2,]   -1    0    0    0    0    0
[3,]    0   -1   -2    0    0    0
[4,]    0    0    0   -1   -2   -3

Is there a better way to code this?

Note: I use the subscript j instead of i in the paper

You can get a decent speedup by just avoiding the creation of rep(0, (kk-1)*(kk-2)/2) . The fact that removing this step significantly speeds things up makes me think you're not likely to get much faster without using Rcpp

create_Uj2 <- function(uj) {
  q <- length(uj) 
  if (q == 1) return(0)
  nr <- q
  nc <- q*(q-1)/2
  Uj <- matrix(0, nrow = nr, ncol = nc)
  for (kk in 2:nr) {
    Uj[kk, ((kk-1)*(kk-2)/2 + 1):(kk*(kk-1)/2)] <- uj[1:(kk-1)]
  }
  -Uj
}

all.equal(create_Uj2(1:400), create_Uj(1:400))
# [1] TRUE
microbenchmark(
  create_Uj2(1:400),
  create_Uj(1:400),
  times = 10,
  unit = 'relative'
)
# Unit: relative
#               expr      min       lq     mean   median       uq       max neval
#  create_Uj2(1:400) 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000    10
#   create_Uj(1:400) 2.070708 1.847242 1.529658 2.028489 1.496275 0.9441935    10

One can also write the function using nested for loops:

create_Uj3 = function(uj){
  nr <- length(uj) 
  if (nr == 1){
    return(0)
  } 
  nc <- nr*(nr-1)/2
  Uj <- matrix(0, nrow = nr, ncol = nc)

  for (kk in 2:nr) {
    for (ll in 1:(kk-1)){
      Uj[kk, ((kk-1)*(kk-2)/2) + ll] <- uj[ll]
    }
  }
  return(-Uj)
}

The Rcpp equivalent:

library(Rcpp)
cppFunction('NumericMatrix create_Uj_rcpp(NumericVector uj) {
  const int nr = uj.size();
  if(nr == 1){
    return 0;
  }
  const int nc = (nr*(nr-1))/2;
  NumericMatrix Uj = NumericMatrix(nr, nc);

  for(int i = 1; i < nr; i++) {
    for(int j = 0; j <= (i - 1); j++){
      Uj(i,(i*(i-1)/2) + j) = -uj[j];
    }
  }
  return Uj;
}')

Benchmarks:

> identical(create_Uj(1:300), create_Uj2(1:300))
[1] TRUE
> identical(create_Uj(1:300), create_Uj3(1:300))
[1] TRUE
> identical(create_Uj(1:300), create_Uj_rcpp(1:300))
[1] TRUE

library(microbenchmark)
microbenchmark(create_Uj(1:300), 
               create_Uj2(1:300), 
               create_Uj3(1:300),
               create_Uj_rcpp(1:300), 
               unit = 'relative')

Unit: relative
                  expr      min       lq     mean   median       uq      max neval
      create_Uj(1:300) 6.299113 6.874493 4.351439 5.128115 4.059644 2.105598   100
     create_Uj2(1:300) 2.859025 3.827864 2.524505 2.873346 2.233600 1.618327   100
     create_Uj3(1:300) 3.078410 4.111635 2.552434 3.109537 2.259824 1.418917   100
 create_Uj_rcpp(1:300) 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100

create_Uj_rcpp wins in terms of speed. Note that the Base R nested for loop method ( create_Uj3 ) is a bit slower than Ryan's solution ( create_Uj2 ) but is still a lot faster than OP's function ( create_Uj ).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM