Optimizing a loop

Question

I am implementing an example given in the book The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Hastie, Tibshirani, Friedman).

My aim is to generate 10 + 10 means from two bivariate normal distributions, then use the first ten means to generate points labelled "Green" and the other ten to generate points labelled "Red". For each point, the mean of the bivariate Gaussian it is drawn from has to be picked at random from the corresponding ten means. I am not too familiar with R, so I used a for loop, which takes an awfully long time as n gets bigger. Here's my code:

library(MASS)  # provides mvrnorm()

Sigma <- diag(2)
greenMeans <- mvrnorm(n = 10, c(1, 0), Sigma)
redMeans <- mvrnorm(n = 10, c(0, 1), Sigma)

n <- 1000000
green <- array(dim = c(n, 2))
red <- array(dim = c(n, 2))

# One point per iteration: pick one of the ten means at random,
# then draw a single point around it with covariance Sigma/5.
for (i in 1:n) {
    newGreen <- mvrnorm(n = 1, greenMeans[sample(1:10, 1), ], Sigma / 5)
    newRed <- mvrnorm(n = 1, redMeans[sample(1:10, 1), ], Sigma / 5)
    green[i, ] <- newGreen
    red[i, ] <- newRed
}

Answer 1

You can skip the for loop entirely and use replicate, though I am not sure how much faster it is:

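do_stuff <- function() {
    # Draw one green and one red point, each around a randomly chosen mean.
    newGreen <- mvrnorm(n = 1, greenMeans[sample(1:10, 1), ], Sigma / 5)
    newRed <- mvrnorm(n = 1, redMeans[sample(1:10, 1), ], Sigma / 5)
    return(list(newGreen, newRed))
}
# Note the parentheses: replicate() must evaluate the call do_stuff(),
# not the function object do_stuff.
replicate(10000, do_stuff())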
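
Since replicate still makes one call per point, a fully vectorized version is likely to help more. The following is a minimal sketch, not a benchmarked solution: it picks all the mean indices in a single sample() call and adds noise for all points at once. The helper name sample_mixture is hypothetical, and the noise is generated with base R (rnorm plus a Cholesky factor) rather than mvrnorm, so the block runs without MASS. The ten means are also generated with rnorm here, which matches the question only because Sigma is the identity.

```r
# Hypothetical helper: draw n points from an equal-weight mixture of
# Gaussians centred on the rows of `means`, all sharing covariance `Sigma`.
sample_mixture <- function(n, means, Sigma) {
    # Pick a mean (row index) for every point in one call.
    idx <- sample(nrow(means), n, replace = TRUE)
    # Standard normal noise, one row per point.
    z <- matrix(rnorm(n * ncol(means)), nrow = n)
    # chol() returns an upper-triangular R with t(R) %*% R == Sigma,
    # so each row of z %*% R has covariance Sigma.
    means[idx, , drop = FALSE] + z %*% chol(Sigma)
}

set.seed(1)
Sigma <- diag(2)
# Base-R stand-ins for the mvrnorm() calls in the question
# (valid here because Sigma is the identity matrix).
greenMeans <- cbind(rnorm(10, mean = 1), rnorm(10, mean = 0))
redMeans <- cbind(rnorm(10, mean = 0), rnorm(10, mean = 1))

n <- 100000
green <- sample_mixture(n, greenMeans, Sigma / 5)
red <- sample_mixture(n, redMeans, Sigma / 5)
```

The result has the same shape as the arrays filled by the loop in the question (n rows, 2 columns), with no per-point function calls.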
