How to interrupt a function within a loop after a certain time in R?

I'm running an algorithm several times through a for loop in R. My loop is very basic and looks like this:

iter <- 5 # number of iterations
result <- list()

for (i in 1:iter) {
  fit <- algorithm() # example function that starts the algorithm
  result[[i]] <- print(fit)
}

The problem is that the running time varies greatly between runs. Some runs take only 10 minutes, while others take over an hour. However, I know that the longer running times occur because the algorithm struggles with certain initial values, and the results of those runs will be wrong anyway.

So I am looking for a solution that (1) interrupts the function (i.e. algorithm() in the example above) after, e.g., 1000 seconds, (2) proceeds with the for loop, and (3) adds an additional iteration for each interruption. In the end, I want results from five runs, each with a running time of less than 1000 seconds.

Does anyone have an idea? Is this even technically possible? Thanks in advance!

I think you can use setTimeLimit for this.

Quick demo:

setTimeLimit(elapsed = 2)
Sys.sleep(999)
# Error in Sys.sleep(999) : reached elapsed time limit
setTimeLimit(elapsed = Inf)

(It's important to reset the time limit with setTimeLimit(elapsed = Inf) once you no longer want it to interrupt your code.)

My "complex algorithm" will simply sleep for a random length of time. Those random lengths are:

set.seed(42)
sleeps <- sample(10, size=5)
sleeps
# [1]  1  5 10  8  2

I'm going to set an arbitrary limit of 6 seconds, beyond which the sleep will be interrupted and we'll get no return value. This should interrupt the third and fourth elements (10 and 8 seconds).

iter <- 5
result <- list()
for (i in seq_len(iter)) {
  result[[i]] <- tryCatch({
    setTimeLimit(elapsed = 6)
    Sys.sleep(sleeps[[i]])
    setTimeLimit(elapsed = Inf)
    c(iter = i, slp = sleeps[[i]])
  }, error = function(e) NULL)
}
result
# [[1]]
# iter  slp 
#    1    1 
# [[2]]
# iter  slp 
#    2    5 
# [[3]]
# NULL
# [[4]]
# NULL
# [[5]]
# iter  slp 
#    5    2 

If you have different "sleeps" and the last runs are the ones interrupted, you end up with a shorter object than you need, because assigning NULL to a position past the end of a list does not extend it. In that case, just pad it:

result <- c(result, vector("list", 5 - length(result)))
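
To make the reason for the padding concrete, here is a minimal sketch. The names demo_result and n_runs are placeholders of mine, not part of the code above, and the "interrupted" run is simulated by assigning NULL directly.

```r
# Assigning NULL to a position past the end of a list does not extend it,
# so a trailing interrupted run leaves the list one element short.
demo_result <- list()
demo_result[[1]] <- "ok"
demo_result[[2]] <- NULL   # simulated interrupted run: nothing is stored
length_before <- length(demo_result)

# Pad the list back to the intended number of runs (n_runs is a placeholder).
n_runs <- 2
demo_result <- c(demo_result, vector("list", n_runs - length(demo_result)))
length_after <- length(demo_result)
```

An interrupted run in the middle does not cause this, because a later successful assignment extends the list and fills the gap with NULL.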

I'll enhance this slightly, for two reasons:

  • I prefer lapply to for loops when filling result in this way; and
  • a complex algorithm can fail for other reasons, and if my sleep failed early the time limit would never be reset, so I'll use on.exit, which ensures that an expression is evaluated when the enclosing function exits, whether due to an error or not.

result <- lapply(seq_len(iter), function(i) {
  setTimeLimit(elapsed = 6)
  on.exit(setTimeLimit(elapsed = Inf), add = TRUE)
  tryCatch({
    Sys.sleep(sleeps[i])
    c(iter = i, slp = sleeps[i])
  }, error = function(e) NULL)  
})
result
# [[1]]
# iter  slp 
#    1    1 
# [[2]]
# iter  slp 
#    2    5 
# [[3]]
# NULL
# [[4]]
# NULL
# [[5]]
# iter  slp 
#    5    2 

In this case, result has length 5, since lapply always returns something for each iteration. (Using lapply is idiomatic in R, whose efficiencies often lie in apply- and map-like functions, unlike other languages where real speed comes from literal for loops.)
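
Because every iteration yields either a value or NULL, the interrupted runs are easy to pick out afterwards. The following is only a sketch on a small hand-made list (demo_result is a placeholder that mimics the shape of result above), not output from a real run.

```r
# Hand-made stand-in for `result`: run 2 was "interrupted" and is NULL.
demo_result <- list(c(iter = 1, slp = 1), NULL, c(iter = 3, slp = 2))

# Flag the interrupted runs and get their positions.
interrupted <- vapply(demo_result, is.null, logical(1))
which_interrupted <- which(interrupted)

# Bind the completed runs into a matrix with columns `iter` and `slp`.
completed <- do.call(rbind, demo_result[!interrupted])
```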

(BTW: instead of the on.exit logic, I could have used tryCatch(..., finally = setTimeLimit(elapsed = Inf)) as well.)
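
A sketch of that finally variant might look like the following. The short durations in demo_sleeps and the 1-second demo_limit are values I picked so the example finishes quickly; they are not the values used above.

```r
demo_sleeps <- c(0.1, 2, 0.2)  # illustrative durations, in seconds
demo_limit  <- 1               # illustrative time limit, in seconds

result_finally <- lapply(seq_along(demo_sleeps), function(i) {
  tryCatch({
    setTimeLimit(elapsed = demo_limit)
    Sys.sleep(demo_sleeps[i])
    c(iter = i, slp = demo_sleeps[i])
  },
  error = function(e) NULL,               # interrupted (or failed) run
  finally = setTimeLimit(elapsed = Inf))  # always lift the limit again
})
```

The finally expression is evaluated whether the block finishes normally or the error handler runs, so the limit is lifted in both cases.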

An alternative to the on.exit logic is to call setTimeLimit(..., transient = TRUE) from within the execution block to be limited. That would make the code:

result <- lapply(seq_len(iter), function(i) {
  tryCatch({
    setTimeLimit(elapsed = 6, transient = TRUE)
    Sys.sleep(sleeps[i])
    c(iter = i, slp = sleeps[i])
  },
  error = function(e) NULL)
})

One benefit of this is that, regardless of whether the limited code block succeeds or is interrupted, a transient limit applies only to the rest of the current computation and does not carry over to later ones, so there is less risk of inadvertently leaving it in place.
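
None of the variants above adds an extra iteration for each interruption, which was point (3) of the question. One way to get there, as a rough sketch only, is to keep looping until enough runs have succeeded. Everything named here is my own assumption: algorithm() is a stand-in that just sleeps, run_with_limit() is a hypothetical helper built on the on.exit variant, and max_attempts is a safety cap so the loop cannot run forever. Note that the handler treats any error, not only a timeout, as a failed run.

```r
# Hypothetical stand-in for the real algorithm: sleeps for a random time.
algorithm <- function() {
  duration <- runif(1, min = 0.02, max = 0.4)
  Sys.sleep(duration)
  duration
}

# Hypothetical helper: run `fun` under a time limit, return NULL on any error.
run_with_limit <- function(fun, limit) {
  setTimeLimit(elapsed = limit)
  on.exit(setTimeLimit(elapsed = Inf), add = TRUE)  # always lift the limit
  tryCatch(fun(), error = function(e) NULL)
}

n_wanted     <- 5    # number of successful runs to collect
limit        <- 0.2  # time limit per run in seconds (1000 in the question)
max_attempts <- 50   # safety cap on the total number of attempts

set.seed(42)
result   <- list()
attempts <- 0
while (length(result) < n_wanted && attempts < max_attempts) {
  attempts <- attempts + 1
  fit <- run_with_limit(algorithm, limit)
  if (!is.null(fit)) {
    result[[length(result) + 1]] <- fit  # store successful runs only
  }
}
```

Since only non-NULL results are stored, result never contains gaps, and the loop ends either with n_wanted successful runs or after max_attempts tries.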
