Speeding up tapply R code

Question

I have 100 matrices, each with 101 rows and 604800 columns. For each matrix, I need to reduce the number of columns to 60480 by averaging each block of 10 consecutive columns.

For example, for the vector

c(1,2,3,4,5,6,7,8,9,10,...)

the 5-column average (the mean of each block of 5 consecutive values) would be:

c(3,8,13,18,...)
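As a small check of the arithmetic above, here is a minimal sketch that reproduces those block averages. The names `x`, `block` and `block_means` are illustrative only and are not part of my real code:

```r
# Toy input: the first 20 integers, averaged in blocks of 5
x <- 1:20
block <- 5

# Filling a matrix column by column puts each block of 5 values in one column,
# so colMeans() returns one mean per block
block_means <- colMeans(matrix(x, nrow = block))
print(block_means)
```

This relies on `length(x)` being a multiple of `block`.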

The code I am using to do this is:

col.av = tapply(col, rep(1:(length(col)/10), each = 10), mean)

Where col is one of my 101 x 604800 matrices. I have a for loop which iterates over the 100 matrices, but my problem is the time needed to compute a single run.

With just one matrix, it takes 20+ minutes to execute, which is not feasible. Are there any suggestions on how I can improve the speed of computation?

Thanks

Answer 1

If you are fine with a for loop, this one works for your case:

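# Preallocate the result: same rows, one column per block of 10 input columns
col.av <- matrix(0, nrow(col), ncol(col)/10)
for (i in seq_len(ncol(col.av))) {
  # Row means over input columns 10*(i-1)+1 through 10*i
  col.av[,i] <- rowMeans(col[,(10*(i-1)+1):(10*i)])
}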

Answer 2

Or without a for-loop, using a custom function for readability. You can always wrap this in your for-loop or a call to apply.

# generate data
nc <- 604800
nr <- 101
test_m <- matrix(rnorm(nc*nr), ncol=nc)

# function to get row means over consecutive blocks of 'window' columns
# (assumes ncol(mm) is a multiple of window)
get_rowmeans <- function(mm, window=10){
  indices <- seq(1, ncol(mm), by=window)
  sapply(indices, function(i){
    # drop=FALSE keeps a matrix even when window is 1
    rowMeans(mm[, i:(i+(window-1)), drop=FALSE])
  })
}

tt <- get_rowmeans(test_m)
# check one block
all(tt[,1]==rowMeans(test_m[,1:10]))
# [1] TRUE
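Another option, not taken from either answer above, is to skip the per-block calls entirely by reshaping the matrix into a 3-D array and averaging over the within-block dimension in one call. This is only a sketch: `block_rowmeans` is a hypothetical helper name, it assumes the number of columns is an exact multiple of `window`, and I have not benchmarked it against the versions above.

```r
# Hypothetical helper: row means over consecutive blocks of `window` columns
block_rowmeans <- function(mm, window = 10) {
  stopifnot(ncol(mm) %% window == 0)
  nblocks <- ncol(mm) / window
  # View the data as rows x window x blocks; because R stores matrices
  # column by column, arr[i, j, k] equals mm[i, (k - 1) * window + j]
  arr <- array(mm, dim = c(nrow(mm), window, nblocks))
  # Move the within-block dimension to the front, then average over it
  colMeans(aperm(arr, c(2, 1, 3)), dims = 1)
}

# Small check against a direct rowMeans() call on one block
set.seed(1)
small <- matrix(rnorm(6 * 40), nrow = 6)
res <- block_rowmeans(small, window = 10)
print(dim(res))
print(all.equal(res[, 2], rowMeans(small[, 11:20])))
```

Note that `aperm()` makes a full copy of the data, so this trades extra memory for fewer function calls. With 101 x 604800 matrices that copy is large, so it is worth checking that it fits in memory before using it inside the loop over all 100 matrices.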
