R: A faster alternative to scaleBy

Question

I am using scaleBy from the doBy R package to standardize a variable by condition for each subject in my dataset. I have about 5137 participants in my dataset, each with about 120 observations. On that dataset, scaleBy is very slow (close to 15 minutes). Other functions (e.g., summaryBy, melt, dcast) run much faster (no more than 20 seconds). Are there faster, simple alternatives to scaleBy?

Here is simulation code that you can run to mimic my dataset in terms of the number of participants, the number of conditions within each participant (it is a repeated-measures design), and the number of observations per condition for each participant:

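    library(doBy)  # provides scaleBy

    # Simulate the dataset: 5137 subjects x 120 observations, 4 conditions
    nSubj <- 5137
    valuesPerSubj <- 120
    nobs <- nSubj * valuesPerSubj
    ttt <- data.frame(cond = rep(c('a', 'b', 'c', 'd'), nobs / 4),
                      rt   = rnorm(nobs, mean = 700, sd = 150),
                      subj = rep(seq_len(nSubj), valuesPerSubj))

    # Time the standardization of rt within each subj x cond cell
    start <- Sys.time()
    zt <- scaleBy(rt ~ subj + cond, data = ttt)
    end <- Sys.time()
    duration <- end - start
    duration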

The scaleBy call in this code takes 11.7 minutes on my computer (you can reduce nSubj in the code above for faster testing). Are there any faster solutions?

Answer 1

I found a much faster approach using dplyr. I replaced the scaleBy line with these two lines:

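    library(dplyr)  # provides group_by and mutate

    # Group by subject and condition, then z-score rt within each group
    gttt <- group_by(ttt, subj, cond)
    zt <- mutate(gttt, zrt = as.numeric(scale(rt)))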

This code took less than 4 seconds to run.
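
For comparison, the same grouped standardization can be sketched in base R with ave(), which needs no extra packages. This is only an illustrative sketch and not part of the original answer: the helper name z_within and the output column zrt are assumptions, the data set is shrunk to 51 subjects so it runs quickly, and no timing is claimed for the full data set.

```r
# Smaller simulated data set than in the question, so the example runs quickly.
# nSubj is odd so that every subject still receives all four conditions.
set.seed(1)
nSubj <- 51
valuesPerSubj <- 120
nobs <- nSubj * valuesPerSubj
ttt <- data.frame(
  cond = rep(c("a", "b", "c", "d"), nobs / 4),
  rt   = rnorm(nobs, mean = 700, sd = 150),
  subj = rep(seq_len(nSubj), valuesPerSubj)
)

# Hypothetical helper: z-score a numeric vector (same centering and scaling
# that scale() applies by default).
z_within <- function(x) (x - mean(x)) / sd(x)

# ave() applies FUN within each subj x cond cell and returns a vector
# in the original row order.
ttt$zrt <- ave(ttt$rt, ttt$subj, ttt$cond, FUN = z_within)

# Sanity check: each cell should have mean ~0 and sd ~1.
cell_means <- tapply(ttt$zrt, list(ttt$subj, ttt$cond), mean)
cell_sds   <- tapply(ttt$zrt, list(ttt$subj, ttt$cond), sd)
range(cell_means)
range(cell_sds)
```

Because ave() returns its result in the original row order, it can be assigned straight back into the data frame without regrouping or merging.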

solution1
0 2016-07-27 10:09:42
