
Batch Job Dependencies Using Open Source/Free Software

I run a large data warehouse plant where many nightly jobs run concurrently; however, many of them depend on an extract or data load process finishing before they can start. We currently use an 'expensive scheduling system' to schedule these.

Is there any way to set up job dependencies using an open source or free Unix/Linux tool such as cron?

Moving to an open source solution would be great and would save us a lot!

Regards, Matt

Cfengine can be made to do something like this. You can set it up as a cron replacement, running arbitrary commands at scheduled times, and you can set up "classes" so that certain actions are performed only if certain classes are enabled. Classes can be anything from "this is a Linux system" to "it's currently between 5 and 10 minutes after the hour" to "system load is above value x" to "this arbitrary shell command that I just specified returned true", so you could set up your classes to indicate your job dependencies.

I doubt that this would be as powerful as a scheduling system (dependencies would have to be set up manually by configuring classes, and scheduling jobs concurrently would require extra scripting or configuration work), but it is free and open source.

Version 2 of Cfengine was not particularly pleasant to work with (in the words of Seth Vidal, "it's [sic] syntax kills kittens"). I haven't used Cfengine 3. Puppet has design goals similar to Cfengine's and may be easier to work with.

I asked a similar question last year (maybe Serverfault would be a better place these days?). Unfortunately, there doesn't seem to be a simple, install-and-go solution.

Cron doesn't handle this natively. Can the process that loads the data write out a status file upon completion? This would allow subsequent jobs to check the status file before doing their real work. Obviously, this isn't an ideal solution (too many points of failure, I suspect), but perhaps it's good enough for what you're trying to accomplish.
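To make the status-file idea concrete, here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the marker directory `/var/run/etl-status`, the `<job>.<date>.done` naming scheme, and the function names are all hypothetical. It shows one way an upstream job could record completion and a downstream job could refuse to start until its dependencies have done so; it is not a full scheduler.

```python
import sys
from datetime import date
from pathlib import Path

# Hypothetical directory where jobs drop their completion markers.
STATUS_DIR = Path("/var/run/etl-status")


def marker_path(job_name, run_date, status_dir=STATUS_DIR):
    """Return the marker file path for one job on one run date."""
    return Path(status_dir) / f"{job_name}.{run_date.isoformat()}.done"


def mark_done(job_name, run_date, status_dir=STATUS_DIR):
    """Called by the upstream job as its last step, after a successful load."""
    path = marker_path(job_name, run_date, status_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()
    return path


def missing_dependencies(deps, run_date, status_dir=STATUS_DIR):
    """Return the dependencies that have not yet written a marker."""
    return [d for d in deps if not marker_path(d, run_date, status_dir).exists()]


if __name__ == "__main__":
    # Dependency names are passed as command-line arguments.
    missing = missing_dependencies(sys.argv[1:], date.today())
    if missing:
        print("waiting on: " + ", ".join(missing), file=sys.stderr)
        sys.exit(1)
```

Saved as, say, `check_deps.py` (a hypothetical name), this could gate a cron entry such as `30 2 * * * check_deps.py extract_orders && run_report.sh`, so the report only runs once the extract has written its marker. Putting the date in the marker name keeps last night's file from satisfying tonight's check, though jobs that run across midnight would need a different way of identifying the run.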

Schedulix is an open source workload automation solution for Linux: http://www.schedulix.org

