
Replace certain values in a matrix with values from another matrix in R

I am a beginner in programming, so this problem may be easy to solve. I have three matrices of the same dimensions, for instance:

A = matrix(1:16,4,4)
B = matrix(rnorm(16,5,1),4,4)
C = matrix(rnorm(16,9,1),4,4)

I want to produce a new matrix (D) that contains the values of B at the positions where the values of A are less than 8. Otherwise, where the values of A are greater than or equal to 8, D should contain the values of C. I have already solved the problem using the ifelse function:

D = ifelse(A<8,B,C)

However, this is very slow. Is there a faster way to produce this matrix D? Many thanks in advance!

Try this:

D <- (A < 8) * B + (A >= 8) * C

It is noticeably faster (roughly 7x by median in the benchmark below):

A = matrix(sample(16,1e4,TRUE),100,100)
B = matrix(rnorm(1e4,5,1),100,100)
C = matrix(rnorm(1e4,9,1),100,100)

require(microbenchmark)

microbenchmark(D1 <- (A < 8) * B + (A >= 8) * C, D2 <- ifelse(A<8,B,C))

Unit: microseconds
                             expr      min        lq    median       uq      max neval
 D1 <- (A < 8) * B + (A >= 8) * C  499.102  528.4075  542.2415  554.983  674.206   100
        D2 <- ifelse(A < 8, B, C) 4015.024 4062.5310 4079.4590 4173.564 5512.694   100

identical(D1,D2)

[1] TRUE
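
The arithmetic version works because R coerces TRUE to 1 and FALSE to 0 when a logical matrix is multiplied by a numeric one. Here is a minimal sketch with constant matrices (the values 5 and 9 are chosen only for illustration). It also shows one caveat worth knowing: since 0 * Inf is NaN, a non-finite value in the matrix that is *not* selected still leaks into the result, whereas ifelse is unaffected.

```r
# Small deterministic example (values chosen for illustration only)
A <- matrix(1:16, 4, 4)
B <- matrix(5, 4, 4)
C <- matrix(9, 4, 4)

mask <- A < 8                        # logical matrix; TRUE acts as 1, FALSE as 0
D_arith  <- mask * B + (!mask) * C   # B where mask is TRUE, C elsewhere
D_ifelse <- ifelse(mask, B, C)
identical(D_arith, D_ifelse)

# Caveat: a non-finite value in the unselected matrix is not masked out
B[4, 4] <- Inf                       # A[4, 4] is 16, so C should be chosen here
D_arith2 <- (A < 8) * B + (A >= 8) * C
is.nan(D_arith2[4, 4])               # 0 * Inf gives NaN, and NaN + 9 stays NaN
ifelse(A < 8, B, C)[4, 4]            # ifelse simply picks the value from C
```

So the arithmetic trick is a safe replacement for ifelse only when B and C are known to contain finite values.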

EDIT: It can get even faster with this:

D <- {A < 8} * {B - C} + C

Note curly braces instead of parentheses and a single comparison. Benchmarking:

microbenchmark(D1 <- {A < 8} * {B - C} + C, D2 <- ifelse(A<8,B,C))

Unit: microseconds
                                    expr      min       lq    median       uq      max neval
 D1 <- {     A < 8 } * {     B - C } + C  289.050  300.881  310.7105  333.645  496.189   100
               D2 <- ifelse(A < 8, B, C) 4027.037 4057.980 4069.8110 4148.627 5372.173   100

sum(abs(D1-D2))
#[1] 8.304468e-14

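However, subtracting C and then adding it back introduces small floating-point rounding errors, so the result is not guaranteed to be bit-identical to the ifelse output (the total absolute difference above is about 8e-14).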
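
To get a feel for how large that rounding error is, here is a rough sketch that compares the two results with all.equal (which tolerates small numerical differences) rather than identical (which does not). The seed is an arbitrary choice for reproducibility, and the exact size of the difference will vary with the data.

```r
set.seed(42)                          # arbitrary seed, only for reproducibility
A <- matrix(sample(16, 1e4, TRUE), 100, 100)
B <- matrix(rnorm(1e4, 5, 1), 100, 100)
C <- matrix(rnorm(1e4, 9, 1), 100, 100)

D_fast  <- {A < 8} * {B - C} + C      # single comparison, but rounds at B - C and at + C
D_exact <- ifelse(A < 8, B, C)        # picks values unchanged from B or C

max(abs(D_fast - D_exact))            # largest element-wise discrepancy
isTRUE(all.equal(D_fast, D_exact))    # equal within all.equal's default tolerance
identical(D_fast, D_exact)            # bit-for-bit comparison; may well be FALSE
```

If downstream code compares values with == or identical, the exact methods are the safer choice.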

Here are a few ways to do it for large arrays:

A <- matrix(1:16,10000,10000)
B <- matrix(rnorm(10000^2),10000,10000)
C <- matrix(rnorm(10000^2),10000,10000)

> invisible(gc())
> system.time(D<-ifelse(A<8,B,C))
   user  system elapsed 
 15.588   6.608  22.237 
> invisible(gc())
> system.time(D<- (A<8)*B+(A>=8)*C)
   user  system elapsed 
  3.104   3.152   6.267 
> invisible(gc())
> system.time({D<-B; w<-which(A>=8); D[w]<-C[w]})
   user  system elapsed 
  2.872   1.416   4.296 
> invisible(gc())
> system.time({D<-B; w<-(A>=8); D[w]<-C[w]})
   user  system elapsed 
  4.200   1.788   5.998 
> invisible(gc())
> system.time(D<- {A<8}*{B-C}+C)
   user  system elapsed 
  2.012   1.996   4.018 
> 

So, on my machine at least, the fastest exact method is {D<-B; w<-which(A>=8); D[w]<-C[w]}. The method D<- {A<8}*{B-C}+C proposed by Ferdinand.kraft is slightly faster but sacrifices some precision.
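
For reuse, the index-assignment method could be wrapped in a small function along these lines. This is only a sketch: the name select_below and its threshold argument are made up for illustration and are not part of any package.

```r
# Hypothetical helper (name is illustrative): B where A < threshold, otherwise C
select_below <- function(A, B, C, threshold = 8) {
  stopifnot(identical(dim(A), dim(B)), identical(dim(A), dim(C)))
  D <- B                          # start from a copy of B
  w <- which(A >= threshold)      # linear indices where C should be used
  D[w] <- C[w]                    # values are copied, never recomputed
  D
}

set.seed(1)                       # arbitrary seed, only for reproducibility
A <- matrix(sample(16, 1e4, TRUE), 100, 100)
B <- matrix(rnorm(1e4, 5, 1), 100, 100)
C <- matrix(rnorm(1e4, 9, 1), 100, 100)

D <- select_below(A, B, C)
identical(D, ifelse(A < 8, B, C))
```

Because values are only copied, the result matches ifelse exactly. One difference to keep in mind: which() drops NA, so where A is NA this version keeps the value from B, while ifelse would return NA.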
