upper.limit <- 15
starting.limit <- 5
lower.limit <- 0
set.seed(123)
x <- sample(-20:20)
for(i in 1:length(x)){
  k <- starting.limit + x[i]
  k <- ifelse(k > upper.limit, upper.limit, ifelse(k < lower.limit, lower.limit, k))
  starting.limit <- k
}
My objective is to calculate the final value of starting.limit at the end of the loop. The condition is that, in any given iteration, k cannot exceed upper.limit or fall below lower.limit.
I have written the above loop to achieve this. However, I have to do this for almost 10000 datasets, so I wondered whether there is a quicker way that avoids the for loop.
Thanks
We can design a function.
# s: starting.limit, x: the x vector, u:upper.limit, l:lower.limit
k_fun <- function(s, x, u = 15, l = 0){
  k <- s + x
  if (k > u){
    k <- u
  } else if (k < l){
    k <- l
  }
  s <- k
  return(s)
}
Then use accumulate from the purrr package to apply the function to the starting limit and the x vector. The output shows how the value changes at each step; the last number is the final result.
library(purrr)
accumulate(c(5, x), k_fun)
# [1] 5 0 11 6 15 15 0 0 10 15 9 15 8 7 3 0 3 0 15 2 2 14 15 7 4 15 15 3 15 0
# [31] 5 0 0 4 12 0 6 7 9 0 0 15
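If adding a package dependency is not desirable, the same fold can probably be written with base R's Reduce. This is only a sketch: clamp_step is a hypothetical helper name (not from the answer above), and the bounds 0 and 15 and the start value 5 are taken from the question.

```r
# Hypothetical helper: add x to the running value s, then clamp to [l, u]
clamp_step <- function(s, x, u = 15, l = 0) {
  min(max(s + x, l), u)
}

set.seed(123)
x <- sample(-20:20)

# Left fold over x, starting from 5; only the final value is returned
final <- Reduce(clamp_step, x, 5)

# Pass accumulate = TRUE to keep every intermediate value instead
path <- Reduce(clamp_step, x, 5, accumulate = TRUE)
```

Here path has length(x) + 1 elements, with the start value first, which mirrors the shape of the accumulate output.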
Benchmark
I used the following code to assess the performance. accumulate is a little faster than the for loop on a vector with 400001 elements.
library(microbenchmark)
perf <- microbenchmark(
  m1 = {
    upper.limit <- 15
    starting.limit <- 5
    lower.limit <- 0
    set.seed(123)
    x <- sample(-200000:200000)
    for(i in 1:length(x)){
      k <- starting.limit + x[i]
      k <- ifelse(k > upper.limit, upper.limit, ifelse(k < lower.limit, lower.limit, k))
      starting.limit <- k
    }
  },
  m2 = {
    set.seed(123)
    x <- sample(-200000:200000)
    vec <- purrr::accumulate(c(5, x), k_fun)
    k <- tail(vec, 1)
  })
# Unit: milliseconds
# expr min lq mean median uq max neval
# m1 821.1735 879.3551 956.7404 941.1145 1019.8603 1290.800 100
# m2 649.3444 717.5986 773.3652 768.0313 823.5749 1006.148 100
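Part of the loop's cost may come from calling ifelse, which is designed for vectors, on a single number in every iteration. A possible variant, not timed here, keeps the loop but clamps with scalar min and max, and wraps it in a function so it can be applied to many datasets. The names clamped_final and datasets are made up for this sketch, and the list of ten random vectors only stands in for the roughly 10000 real datasets.

```r
# Hypothetical wrapper around the original loop, using scalar min/max
clamped_final <- function(x, start = 5, lower = 0, upper = 15) {
  k <- start
  for (xi in x) {
    # Add the next value, then clamp the running total to [lower, upper]
    k <- min(max(k + xi, lower), upper)
  }
  k
}

# Stand-in data: a list of ten shuffled vectors
set.seed(123)
datasets <- replicate(10, sample(-20:20), simplify = FALSE)

# One final value per dataset
finals <- vapply(datasets, clamped_final, numeric(1))
```

Whether this is actually faster than accumulate would need to be checked with microbenchmark on the real data.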
You can try something like the following with the tidyverse.
First, make x into a data frame:
library(dplyr)
x <- as.data.frame(sample(-20:20))
colnames(x) <- c("dat")
and then pipe like:
x %>%
  mutate(sm = starting.limit) %>%
  mutate(sm = if_else(sm + lead(dat, 1) > upper.limit, upper.limit,
                      if_else(sm + lead(dat, 1) < lower.limit, lower.limit, sm))) %>%
  select(sm) %>%
  filter(!is.na(sm)) %>%  # drop the last row, where lead() has no next value
  tail(n = 1)
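Adjust the final select, filter and tail calls as needed. Note that this pipeline is vectorised: each row is computed from the initial starting.limit and the next value of dat only, so the clamped value is not carried forward from row to row as it is in the loop, and the result will generally differ from the loop's.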
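Because each step of the loop depends on the previous clamped value, a row-wise calculation is not in general equivalent to it. A small hand-checkable sketch in base R illustrates the difference; the input vector and the helper clamp_step are made up for illustration, with the start value 5 and bounds 0 and 15 taken from the question.

```r
# Hypothetical one-step helper: add x to s, then clamp to [l, u]
clamp_step <- function(s, x, u = 15, l = 0) min(max(s + x, l), u)

x <- c(20, -3, -30, 4)  # made-up input

# Running value carried forward from step to step, starting at 5
carried <- Reduce(clamp_step, x, 5, accumulate = TRUE)
# 5 15 12 0 4

# Row-wise version: every element is clamped against the initial 5 only
rowwise <- pmin(pmax(5 + x, 0), 15)
# 15 2 0 9
```

The carried computation ends at 4, while the last row-wise value is 9.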
Benchmark
I was curious how this performs against the other solution, so I added my code to the microbenchmark already provided:
perf <- microbenchmark(
  m1 = {
    upper.limit <- 15
    starting.limit <- 5
    lower.limit <- 0
    set.seed(123)
    x <- sample(-200000:200000)
    for(i in 1:length(x)){
      k <- starting.limit + x[i]
      k <- ifelse(k > upper.limit, upper.limit, ifelse(k < lower.limit, lower.limit, k))
      starting.limit <- k
    }
  },
  m2 = {
    set.seed(123)
    x <- sample(-200000:200000)
    vec <- purrr::accumulate(c(5, x), k_fun)
    k <- tail(vec, 1)
  },
  m3 = {
    x <- sample(-200000:200000)
    xd <- as.data.frame(x)
    colnames(xd) <- c("dat")
    xd %>%
      mutate(sm = starting.limit) %>%
      mutate(sm = if_else(sm + lead(dat, 1) > upper.limit, upper.limit,
                          if_else(sm + lead(dat, 1) < lower.limit, lower.limit, sm))) %>%
      select(sm) %>%
      filter(!is.na(sm)) %>%
      tail(n = 1)
  }
)
output:
Unit: milliseconds
expr min lq mean median uq max neval
m1 1223.49718 1255.69514 1272.2679 1260.9643 1272.3401 1392.0402 100
m2 964.76948 982.96555 1007.5521 989.5366 1007.9106 1173.2754 100
m3 68.80358 76.77386 133.0509 170.5572 177.0051 274.9299 100