
Running shell scripts in parallel using Python

I have a task to optimize a process, and for that I need to orchestrate multiple shell scripts using Python.

Let's assume I have 13 ksh scripts. I want to build the Python script such that:

  1. some scripts run in parallel
  2. some scripts are triggered after the completion of their predecessor scripts
  3. some scripts depend on the output of their predecessor scripts and run according to a decision box.

I have summarized the above points in a flow diagram.

[image: flow diagram]

There is also a special requirement for failure handling, as follows:

  1. After each shell script completes, it should create a particular flag.
  2. If a ksh script (assume shell_script_3) fails, the program should stop and not create a flag.
  3. Once the job is fixed, when the program is re-run it should start from the ksh script that follows the failed one (shell_script_5).

I need help designing a Python script for the above requirements.

This can quickly become a complicated problem, but at its core you have:

  1. a job definition that defines a job's start condition (for example, time of day and/or completion of other jobs) and what it executes
  2. an execution engine that handles the running of the jobs, in parallel as needed
  3. an event processor that tracks which jobs are waiting on a start time or other conditions, and processes the status reported by the execution engines. It would also be responsible for dispatching any new jobs to the execution engine, including providing any signalling information used by the execution plan for the decision tree.
  4. lots of extra things like log access, controls around updating job definitions, alerts, run-too-long events, and restarting/cancelling jobs.
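To make the first three pieces concrete, here is a minimal sketch using only the standard library. Everything named in it (`Job`, `run_job`, `run_all`, the `.done` flag files) is hypothetical and made up for illustration; it is one possible shape, not a finished scheduler. It assumes each script signals failure with a non-zero exit code. Note that on a re-run this sketch starts again at the failed job itself; to skip it as the question describes, you would create that job's flag by hand after fixing it. Decision boxes and time-based start conditions are left out.

```python
import subprocess
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Job:
    """Job definition: what to run and which jobs must finish first."""
    name: str
    command: tuple           # argv, e.g. ("ksh", "shell_script_1.ksh")
    depends_on: tuple = ()   # names of predecessor jobs


def run_job(job):
    """Execution engine: run one job and return its exit code."""
    try:
        return subprocess.run(job.command).returncode
    except OSError:
        return 127  # the command could not be started at all


def run_all(jobs, flag_dir, max_workers=4):
    """Event loop: dispatch ready jobs, flag each success, stop on failure.

    Returns the name of the first failed job, or None if all jobs succeeded.
    """
    flag_dir = Path(flag_dir)
    flag_dir.mkdir(parents=True, exist_ok=True)
    # Jobs whose flag already exists count as done, so a re-run resumes.
    done = {j.name for j in jobs if (flag_dir / f"{j.name}.done").exists()}
    pending = {j.name: j for j in jobs if j.name not in done}
    running = {}  # future -> job
    failed = None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            if failed is None:
                # A job is ready once all of its predecessors are done.
                ready = [j for j in pending.values()
                         if set(j.depends_on) <= done]
                for job in ready:
                    del pending[job.name]
                    running[pool.submit(run_job, job)] = job
            if not running:
                break  # stopped after a failure, or nothing can become ready
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                job = running.pop(future)
                if future.result() == 0:
                    (flag_dir / f"{job.name}.done").touch()
                    done.add(job.name)
                elif failed is None:
                    failed = job.name  # no flag is written for a failed job

    if failed is None and pending:
        raise RuntimeError(f"unsatisfiable dependencies: {sorted(pending)}")
    return failed
```

Jobs with no unmet dependencies are submitted together, so independent scripts run in parallel up to `max_workers`. After a failure nothing new is dispatched, but jobs that were already running are allowed to finish.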

There are a lot of pitfalls in designing this, for example: how do you detect and prevent loops in the job dependencies? There are many open source systems out there for batch jobs; you should take a look at them.
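On the loop-detection pitfall specifically, one option on Python 3.9+ is the standard library's `graphlib`, which refuses to order a graph that contains a cycle. The sketch below assumes the dependencies are kept as a plain mapping from job name to predecessor names; `find_cycle` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(dependencies):
    """Return the jobs forming a dependency cycle, or None if there is none.

    `dependencies` maps each job name to the names of the jobs it waits for.
    """
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as exc:
        # The second element of args is the list of nodes in the cycle;
        # its first and last entries are the same node.
        return list(exc.args[1])
    return None
```

Running this check when job definitions are loaded, before anything is dispatched, means a bad definition is rejected up front instead of leaving jobs waiting forever.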
