简体   繁体   中英

Faster way to set up a sparse matrix in R?

I'm trying to set up a particular kind of sparse matrix in R. The following code gives the result I want, but is extremely slow:

library(Matrix)

f <- function(x){

  out <- rbind(head(x, -1), tail(x, -1))
  out <- bdiag(split(out, col(out)))
  return(out)

}#END f 

x <- outer(1:250, (1:5)/10, '+')
do.call(rBind, apply(x, 1, f))

I'll need to do this thousands of times in a simulation study I'm working on so this is a pretty serious bottleneck. The Rprof() output in this case was pretty confusing. I'd appreciate any suggestions you might have for how to speed things up.

Thank you for your time.

This code runs a lot faster (<0.01 seconds, vs 3.36 seconds on my box) because it avoids all of that extremely slow rBind 'ing. The key is to first prepare the row indices, column indices, and values of the non-zero cells. A single call to sparseMatrix(i,j,x) will then construct the sparse matrix without requiring even one call to rBind() .

library(Matrix)
A <- 1:250
B <- (1:5)/10
x <- outer(A, B, '+')

f2 <- function(x){
    n <- length(x)
    rep(x, each=2)[-c(1, 2*n)]
}

system.time({
  val <- as.vector(apply(x,1,f2))
  n <- length(val)
  i <- seq_len(n)
  j <- rep(rep(seq_len(length(B)-1), each=2), length.out=n)
  outVectorized <- sparseMatrix(i = i, j = j, x = val)
})
#    user  system elapsed 
#       0       0       0 

Just to show that the results are the same:

## Your approach
f <- function(x){
    out <- rbind(head(x, -1), tail(x, -1))
    out <- bdiag(split(out, col(out)))
    return(out)
}

system.time(outRBinded <- do.call(rBind, apply(x, 1, f)))
#    user  system elapsed 
#    3.36    0.00    3.36 

identical(outVectorized, outRBinded)
# [1] TRUE

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM