Is there a way to trigger a cron job to run based on the existence of a file rather than a specific time?

I am trying to use cronR or the task scheduler add-in in R to run a script daily, based on a .csv file that gets updated every day. The catch is that there is no specific time of day at which the CSV file gets updated (say it was updated at 3 PM on 4/20, at 2:30 PM on 4/21, and at 12 PM on 4/22). The main trigger is not the time of day but whether that day's file exists. Is there a way I can run this using either of the R add-ins? I use a server at work, so I am not using Windows Task Scheduler, since R is not on my machine.

Instead of running the cron job once a day, run it every 5 minutes (or at some other reasonable interval), and keep track of when it last processed the file. For example:

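# TRUE when `filename` exists and is newer than its ".seen" marker file
needswork <- function(filename, expr, updated = paste0(filename, ".seen")) {
  if (!file.exists(filename)) return(FALSE)  # nothing to process yet
  if (!file.exists(updated)) return(TRUE)    # never processed before
  return(file.info(updated)$mtime < file.info(filename)$mtime)
}
# record that `filename` was processed by (re)writing the marker file
donework <- function(filename, expr, updated = paste0(filename, ".seen")) {
  writeLines(character(0), updated)
}

if (needswork("/path/to/mainfile.csv")) {
  # process the file here
  # ...
  # update the marker so later runs skip the unchanged file
  donework("/path/to/mainfile.csv")
}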

I might extend needswork a little to send a notification when there is a problem, such as the file not having been updated recently:

needswork <- function(filename, expr, updated = paste0(filename, ".seen")) {
  if (!file.exists(filename)) return(FALSE)
  # notify if the file has not been modified in the last 24 hours
  if (difftime(Sys.time(), file.info(filename)$mtime, units = "secs") > 60*60*24) {
    some_notify_function()  # placeholder: substitute your own notifier
    # perhaps something like
    msg <- paste("The file", sQuote(filename), "has not been updated since",
                 file.info(filename)$mtime)
    RPushbullet::pbPost("note", title = "No recent updates", body = msg)
  }
  if (!file.exists(updated)) return(TRUE)
  return(file.info(updated)$mtime < file.info(filename)$mtime)
}

Cron is strictly a time-based scheduler.

Having said that, there is a workaround.

  1. Create a script (e.g. mycron.py) as follows:

import os.path

if os.path.isfile("/tmp/myfile.csv"):
    # File exists: do something here
    pass
else:
    # File does not exist: nothing to do
    pass

  2. Schedule this script (mycron.py) to run at regular intervals.

The Python script is just an example. Feel free to use your favorite scripting language.
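A bare existence check fires on every scheduled run for as long as the file is present. The sketch below is one way to avoid that, assuming the marker-file idea from the R answer above carries over to Python: a ".seen" file is touched after each successful run, and the data file is reprocessed only when it is newer than the marker. The names needs_work, mark_done and process, the ".seen" suffix, and the paths in the crontab comment are all illustrative assumptions, not part of either answer.

```python
import os
import sys


def needs_work(data_path, marker_path=None):
    """Return True when data_path exists and is newer than its marker file."""
    marker_path = marker_path or data_path + ".seen"
    if not os.path.isfile(data_path):
        return False  # nothing to process yet
    if not os.path.isfile(marker_path):
        return True  # file is present but has never been processed
    # Reprocess only if the data file changed after the last recorded run.
    return os.path.getmtime(marker_path) < os.path.getmtime(data_path)


def mark_done(data_path, marker_path=None):
    """Create the marker if needed and set its modification time to now."""
    marker_path = marker_path or data_path + ".seen"
    with open(marker_path, "a"):
        pass
    os.utime(marker_path, None)


def process(data_path):
    """Hypothetical placeholder for the real work."""
    print("processing", data_path)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example crontab entry (interpreter and script paths are assumptions):
    # */5 * * * * /usr/bin/python3 /path/to/mycron.py /tmp/myfile.csv
    target = sys.argv[1]
    if needs_work(target):
        process(target)
        mark_done(target)
```

Run this way, cron still wakes the script every five minutes, but the work happens at most once per update of the CSV file.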
