R Looping through two vectors
Good day,
I need a function that creates increasing IDs from two parameters. I came up with this function, which works fine, but I want it to be vectorized and I cannot seem to avoid a Big O factor of N². Are there any 'better' ways to do this?
Standard function:
threshold <- 3
calculateID <- function(p, r) {
return((p-1) * threshold + r)
}
calculateID(1, 1) #returns 1
calculateID(1, 2) #returns 2
calculateID(1, 3) #returns 3
calculateID(2, 1) #returns 4
#.....
calculateID(5, 3) #returns 15
Vectorized function: I would like to pass the two parameters as vectors so the function only has to be called once:
threshold <- 3
calculateIDVectorized <- function(p, r) {
return(unlist(
lapply(p, function(x) {
lapply(r, function(y) {
(x-1) * threshold + y
})
})
))
}
calculateIDVectorized(c(1, 2, 3, 4, 5), c(1, 2, 3)) # should return 1-15
To clarify: I want every combination of p and r to be used, so the result should always have length(p) * length(r) elements.
You can use outer:
calculateIDVectorized <- function(p, r) as.vector(t(outer(p, r, calculateID)))
calculateIDVectorized(c(1, 2, 3, 4, 5), c(1, 2, 3))
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
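A small sketch of why the t() call matters here, assuming the calculateID and threshold definitions from the question: outer() returns a matrix whose rows follow p and whose columns follow r, and as.vector() reads a matrix column by column, so the transpose is what puts the IDs in increasing order.

```r
threshold <- 3
calculateID <- function(p, r) (p - 1) * threshold + r

p <- 1:5
r <- 1:3

m <- outer(p, r, calculateID)   # 5 x 3 matrix: rows follow p, columns follow r
col_major <- as.vector(m)       # read column by column: p varies fastest
row_major <- as.vector(t(m))    # after transposing: r varies fastest

print(col_major)
print(row_major)
```

Without the transpose the same 15 IDs come back, only ordered by r first and then by p.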
Another base R option using do.call + Vectorize + expand.grid:
> do.call(Vectorize(calculateID),unname(rev(expand.grid(r,p))))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Data
p <- c(1, 2, 3, 4, 5)
r <- c(1, 2, 3)
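To make the expand.grid step easier to follow, here is a minimal sketch that builds the grid with named columns (the names r and p are chosen here for illustration) and applies calculateID to the columns directly. It relies on the observation that calculateID only uses arithmetic, which is already vectorized in R, so this variant skips the Vectorize wrapper.

```r
threshold <- 3
calculateID <- function(p, r) (p - 1) * threshold + r

p <- c(1, 2, 3, 4, 5)
r <- c(1, 2, 3)

# expand.grid varies its first argument fastest, so r is listed first
grid <- expand.grid(r = r, p = p)
print(head(grid, 4))

# plain arithmetic on the two columns, one ID per row of the grid
ids <- calculateID(grid$p, grid$r)
print(ids)
```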
Since the OP was interested in fast computation, I compared the solutions:
library(microbenchmark)
library(dplyr) # %>%, rowwise(), transmute(), arrange(), pull()
library(tidyr) # crossing()
p <- 1:500 # using larger data set
r <- 1:20
threshold <- length(r) # parameterizing threshold
m = microbenchmark(
tidy= crossing(p, r) %>%
rowwise %>%
transmute(out = calculateID(p, r)) %>%
pull(out),
dcv = do.call(Vectorize(calculateID),unname(rev(expand.grid(r,p)))),
numbering = rev(expand.grid(r,p)) %>%
arrange(Var2, Var1) %>%
transmute(out = row_number()) %>%
pull(out),
hybrid = rev(expand.grid(r,p)) %>%
rowwise() %>%
transmute(out = calculateID(Var2, Var1)) %>%
pull(out),
outer = as.vector(t(outer(p, r, calculateID))),
outer_c = c(t(outer(p, r, calculateID))),
david = rep((p - 1), each = length(r)) * threshold + r
)
m
# Unit: microseconds
# expr min lq mean median uq max neval
# tidy 45441.869 47370.776 52123.6770 49482.1970 54158.4285 116780.840 100
# dcv 16259.935 17156.225 19641.6731 17897.8885 21576.0865 55489.586 100
# numbering 5947.147 6379.337 7127.5125 6576.3560 6952.3205 12005.854 100
# hybrid 44124.099 45856.210 51531.9480 47642.5405 52225.0600 175778.380 100
# outer 106.655 120.711 141.1137 128.9665 143.2465 265.072 100
# outer_c 117.811 137.446 152.5958 142.1315 155.9650 327.101 100
# david 223.125 230.711 257.5622 241.8675 260.6100 920.164 100
So it looks like the options using outer() are fastest, with as.vector() edging out c(). @DavidArenburg's solution is also right up there with the solutions using outer().
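The "david" entry in the benchmark is only shown inline, so here is a sketch of it as a standalone function. The function name calculateIDDirect and the explicit threshold argument are assumptions made for illustration; the benchmark expression reads threshold from the global environment and lets r recycle implicitly.

```r
# Hypothetical helper: the name and the threshold argument are illustrative,
# not part of the original answers.
calculateIDDirect <- function(p, r, threshold = length(r)) {
  p_rep <- rep(p - 1, each = length(r))   # each p repeated once per r
  r_rep <- rep(r, times = length(p))      # r cycled once per p
  p_rep * threshold + r_rep
}

ids_small <- calculateIDDirect(1:5, 1:3, threshold = 3)
ids_large <- calculateIDDirect(1:500, 1:20)

print(ids_small)
print(length(ids_large))
```

Both vectors are built in a single pass with no intermediate matrix or data frame, which may explain why this approach lands in the same range as outer().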
I added a hybrid option using dplyr::transmute() because rev(expand.grid()) was significantly faster than crossing(). The hybrid appears to be marginally faster than the straight dplyr route, but still not as fast as the do.call(Vectorize... or the others.
Another option (added above) would be to arrange the data frame and create IDs using dplyr::row_number() or 1:nrow(). This option would work if all the combinations of p and r are present and unique, but would fail with non-sequential values.
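The caveat about non-sequential values can be illustrated with a base R sketch. It uses order() and seq_len() as stand-ins for the dplyr arrange() and row_number() calls, and the gap in p is made up for the example.

```r
threshold <- 3
calculateID <- function(p, r) (p - 1) * threshold + r

p <- c(1, 3, 5)   # non-sequential p (illustrative values)
r <- c(1, 2, 3)

grid <- expand.grid(r = r, p = p)
grid <- grid[order(grid$p, grid$r), ]         # same role as arrange()

by_formula <- calculateID(grid$p, grid$r)     # IDs from the formula
by_rownum  <- seq_len(nrow(grid))             # IDs from row position

print(by_formula)
print(by_rownum)
```

The row-number approach ignores the gaps in p, so the two results only agree when p and r are complete, sequential ranges.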
An option with tidyverse:
library(dplyr)
library(tidyr)
crossing(p, r) %>%
rowwise %>%
transmute(out = calculateID(p, r)) %>%
pull(out)
#[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15