How to automate operations between each column of two matrices using R?
I have written a general function for MAPE (mean absolute percentage error):
mape <- function(y, yhat)
mean(abs((y - yhat)/y))
I want to calculate MAPE between each column of two different matrices. Suppose the actual values are in the following format:
y = matrix(c(11, 12, 12, 12, 14, 16, 23, 21, 28),byrow=TRUE,ncol=3)
and the predicted values are:
yp = matrix(c(12, 13, 14, 12, 15, 17, 24, 22, 28),byrow=TRUE,ncol=3)
This can be done manually for each column, e.g. mape(y[,1], yp[,1]).
How do I automate this process of performing an operation (not only MAPE, but any other operation too) between corresponding columns of two large matrices in R? Can for loops be avoided using apply/sapply?
Of course, mape can be vectorised with colMeans, which avoids looping over columns altogether:
mapeVec <- function(y, yhat)
colMeans(abs((y-yhat)/y))
f3 <- function() { mapeVec(y, yp) }
Timings, with f1() (sapply) and f2() (for loop) taken from the other answer:
Unit: milliseconds
 expr       min        lq      mean    median        uq      max neval cld
 f1() 33.677431 34.121107 35.494355 34.441823 35.078125 46.16782   100   b
 f2() 33.558224 33.970123 35.609414 34.239525 34.881354 49.99195   100   b
 f3()  8.344952  8.525146  9.218695  8.568763  8.709681 17.82791   100   a
identical(f1(), f3()) # TRUE
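For operations that have no ready-made vectorised form, a small generic helper is one option. The following is only a sketch: `colwise` is a hypothetical helper name (not part of base R), `rmse` is a second metric added purely for illustration, and the helper assumes both matrices have identical dimensions and that the function returns a single number per column pair.

```r
# Hypothetical helper (not part of base R): apply a two-argument function
# to corresponding columns of two equally sized matrices.
colwise <- function(a, b, f) {
  stopifnot(identical(dim(a), dim(b)))
  # vapply enforces that f returns exactly one numeric value per column pair
  vapply(seq_len(ncol(a)), function(j) f(a[, j], b[, j]), numeric(1))
}

mape <- function(y, yhat) mean(abs((y - yhat) / y))
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))  # illustrative second metric

# Same data as in the question
y  <- matrix(c(11, 12, 12, 12, 14, 16, 23, 21, 28), byrow = TRUE, ncol = 3)
yp <- matrix(c(12, 13, 14, 12, 15, 17, 24, 22, 28), byrow = TRUE, ncol = 3)

mape_cols <- colwise(y, yp, mape)
rmse_cols <- colwise(y, yp, rmse)

# The colMeans version should agree with the generic helper
stopifnot(isTRUE(all.equal(mape_cols, colMeans(abs((y - yp) / y)))))
```

The helper trades speed for generality: where a matrix-level formulation such as colMeans exists it will usually be faster, while the helper works for any function of two vectors.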
sapply over the column indices, seq_len(ncol(y)), should do the trick:
mape <- function(y, yhat)
  mean(abs((y - yhat)/y))
# Same data as in the question
y <- matrix(c(11, 12, 12, 12, 14, 16, 23, 21, 28), byrow = TRUE, ncol = 3)
yp <- matrix(c(12, 13, 14, 12, 15, 17, 24, 22, 28), byrow = TRUE, ncol = 3)
# id indexes columns, so iterate up to ncol(y); seq(nrow(y)) is only correct for square matrices
sapply(seq_len(ncol(y)), function(id) { mape(y[, id], yp[, id]) })
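One detail worth checking is that the index runs over columns, which only coincides with the row count when the matrices are square. The sketch below uses made-up 4 x 2 matrices (not from the question) to illustrate; vapply is swapped in for sapply here only to make the expected return type explicit.

```r
mape <- function(y, yhat) mean(abs((y - yhat) / y))

# Illustrative non-square data (4 rows, 2 columns); the values are made up
y  <- matrix(c(10, 20, 40, 50,
               100, 200, 400, 500), ncol = 2)
yp <- matrix(c(11, 18, 44, 50,
               110, 180, 440, 500), ncol = 2)

# Iterate over column indices, not row indices
res <- vapply(seq_len(ncol(y)),
              function(id) mape(y[, id], yp[, id]),
              numeric(1))
res
```

With a row-based index such as seq(nrow(y)), the same call would request columns 3 and 4 and stop with a "subscript out of bounds" error.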
Benchmark against an equivalent for loop:
library(microbenchmark)
mape <- function(y, yhat)
  mean(abs((y - yhat)/y))
# Random 1000 x 1000 test matrices
y <- matrix(rnorm(1000000), nrow = 1000, ncol = 1000)
yp <- matrix(rnorm(1000000), nrow = 1000, ncol = 1000)
# f1: sapply over the column indices
f1 <- function() { sapply(seq_len(ncol(y)), function(id) { mape(y[, id], yp[, id]) }) }
# f2: explicit for loop with a preallocated result vector
f2 <- function() {
  a <- numeric(ncol(y))
  for (id in seq_len(ncol(y))) {
    a[id] <- mape(y[, id], yp[, id])
  }
  a
}
microbenchmark(
  f1(),
  f2()
)
Results:
Unit: milliseconds
 expr      min       lq     mean   median       uq      max neval cld
 f1() 33.28310 34.15209 36.57389 35.42845 36.20803 48.11936   100   a
 f2() 34.14755 34.78859 37.65782 36.33395 37.06874 64.10664   100   a
The two run at essentially the same speed, but f1 (the sapply() approach) looks much more compact and "clean".