[英]Is there a way for me to avoid the for loop or make it more efficient?
I want to pick one element from x, one element from y (x and y are mutually exclusive), and one element from x or y that has not already been selected. 我想从x中选择一个元素,从y中选择一个元素(x和y是互斥的),并从x或y中选择一个尚未被选择的元素。 I then want to repeat the process a specified number of times and store the results of each trial in a dataframe. 然后,我想重复该过程指定的次数,并将每个试验的结果存储在一个数据框中。 (note: I am not interested in trying to find every possible combination) (注意:我对尝试找到每种可能的组合都不感兴趣)
The code below works but slows considerably as the number of trials increases. 下面的代码有效,但是随着试用次数的增加,速度会大大降低。
x <- 1:4
y <- 5:8
z <- c(x, y) #edited - previous code read a, b in place of x, y
trials <- 5
sel <- data.frame()
set.seed(123)
for (i in 1:trials){
x_sel <- sample(x, 1)
y_sel <- sample(y, 1)
rem <- z[!(z %in% c(x_sel, y_sel))]
z_sel <- sample(rem, 1)
sel <- rbind(sel, cbind(x_sel, y_sel, z_sel))
}
This should probably be somewhat faster, but I doubt it's the fastest possible. 这可能会更快一些,但我怀疑这可能是最快的。 Certainly Rcpp would be the fastest, I would think. 我认为Rcpp当然是最快的。
> set.seed(123)
> x <- 1:4
> y <- 5:8
> z <- c(x, y)
> trials <- 5
>
> xval <- sample(x,size = trials,replace = TRUE)
> yval <- sample(y,size = trials,replace = TRUE)
> zval <- mapply(FUN = function(x,y,z) {sample(setdiff(z,c(x,y)),1)},
x = xval,
y = yval,
MoreArgs = list(z = z))
>
> result <- data.frame(xval = xval,
yval = yval,
zval = zval)
> result
xval yval zval
1 2 5 8
2 4 7 3
3 2 8 6
4 4 7 5
5 4 6 1
At only 10k samples, this appears to be ~37x faster than your for loop (which was primarily inefficient because of the appending things one at a time onto sel
, not anything inherent in the for
loop). 在只有1万个样本的情况下,这似乎比您的for循环快约37倍(这主要是效率低下,因为一次将一个内容附加到sel
,而不是for
循环中固有的任何内容)。 The difference between this and a more sensibly written for
loop would likely be much less. 这和更明智地编写for
循环之间的差异可能会小得多。
My approach is not elegant but seems to be efficient when a number of trials is large. 我的方法不够优雅,但是在进行大量试验时似乎很有效。 In order to prove it I created 3 function: f1 - yours, f2 - joran's, f3 - mine 为了证明这一点,我创建了3个函数:f1-您的f2-joran的f3-我的
library(microbenchmark)
f1 <- function() {
x <- 1:4
y <- 5:8
z <- c(x, y) #edited - previous code read a, b in place of x, y
trials <- 5000
sel <- data.frame()
set.seed(123)
for (i in 1:trials) {
x_sel <- sample(x, 1)
y_sel <- sample(y, 1)
rem <- z[!(z %in% c(x_sel, y_sel))]
z_sel <- sample(rem, 1)
sel <- rbind(sel, cbind(x_sel, y_sel, z_sel))
}
return(sel)
}
f2 <- function() {
set.seed(123)
x <- 1:4
y <- 5:8
z <- c(x, y)
trials <- 5000
xval <- sample(x, size = trials, replace = TRUE)
yval <- sample(y, size = trials, replace = TRUE)
zval <-
mapply(
FUN = function(x, y, z) {
sample(setdiff(z, c(x, y)), 1)
},
x = xval,
y = yval,
MoreArgs = list(z = z)
)
result <- data.frame(xval = xval,
yval = yval,
zval = zval)
return(result)
}
f3 <- function() {
x <- 1:4
y <- 5:8
z <- c(x, y) #edited - previous code read a, b in place of x, y
trials <- 5000
set.seed(123)
x_sel <- sample(x, trials, replace = TRUE)
y_sel <- sample(y, trials, replace = TRUE)
z_mac <- matrix(z,
nrow = trials,
ncol = length(z),
byrow = TRUE)
take <- z_mac != x_sel & z_mac != y_sel
z_sel <- t(matrix(t(z_mac)[t(take)], ncol = trials))
take <- sample(1:ncol(z_sel), size = trials, replace = TRUE)
cbind(x_sel, y_sel, z_sel = z_sel[cbind(1:trials, take)])
}
microbenchmark(f1(), f2(), f3(), times = 10L)
Unit:milliseconds
expr min lq mean median uq max neval
f1() 2193.448113 2248.442450 2258.626023 2258.135072 2267.333956 2346.457082 10
f2() 205.124501 208.672947 213.520267 212.208095 219.397101 222.990083 10
f3() 2.463567 2.491762 2.570517 2.512588 2.603582 2.827863 10
My f3 function is 856 times faster than f1 and 83 times faster than f2. 我的f3函数比f1快856倍,比f2快83倍。 When we consider oryginal problem (trials=5) then 当我们考虑原始问题(试验= 5)时
> microbenchmark(f1(), f2(), f3(), times = 10L)
Unit: microseconds
expr min lq mean median uq max neval
f1() 1215.924 1268.790 1296.7610 1300.5095 1321.015 1370.998 10
f2() 587.937 590.500 619.6248 612.9285 638.881 687.261 10
f3() 68.886 78.819 86.7652 81.2225 91.315 116.947 10
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.