
Oozie: running hundreds of jobs in parallel

Initially we had five tables to process, so we created a fork with five paths, as shown below. Now we need to process 125 tables in parallel. If I add a fork path for each of the 125 tables, workflow.xml becomes too large to maintain. How can I configure the workflow to process all 125 tables in parallel?

<start to="fork-966"/>
<fork name="fork-966">
    <path start="table1_sqoop" />
    <path start="table2_sqoop" />
    <path start="table3_sqoop" />
    <path start="table4_sqoop" />
    <path start="table5_sqoop" />
</fork>

Any help is appreciated.

It sounds like 125 may not be the upper limit.

Also, if you need to read from 125 tables, I suggest you rethink your design.

For your current problem, you can do one of the following:

  1. Fork 125 times.
  2. Use a sub-workflow, which can be parameterized.
  3. Create a bundle that runs 125 workflows, so you write the workflow only once and the bundle runs it 125 times. Also, if one run fails, the rest keep running.
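
Option 2 could look roughly like the sketch below. The parent workflow keeps the fork, but each forked action shrinks to a short sub-workflow call that differs only in one property. The application path `/apps/sqoop-import`, the property names `tableName`, `jdbcUrl` and `targetDir`, and the node names `join-966` and `fail` are assumptions made for illustration; they do not come from the original workflow.

```xml
<!-- Parent workflow.xml (fragment): one short action per table -->
<start to="fork-966"/>
<fork name="fork-966">
    <path start="table1_sqoop"/>
    <path start="table2_sqoop"/>
    <!-- ... one path per table ... -->
</fork>

<action name="table1_sqoop">
    <sub-workflow>
        <!-- hypothetical HDFS path of the shared child workflow -->
        <app-path>${nameNode}/apps/sqoop-import</app-path>
        <!-- pass the parent's job configuration down to the child -->
        <propagate-configuration/>
        <configuration>
            <property>
                <name>tableName</name>
                <value>table1</value>
            </property>
        </configuration>
    </sub-workflow>
    <ok to="join-966"/>
    <error to="fail"/>
</action>
<!-- table2_sqoop and the rest are identical except for tableName -->

<!-- every fork needs a matching join -->
<join name="join-966" to="end"/>
```

The child workflow is then written once and reads the table name from its configuration:

```xml
<!-- Child workflow.xml, deployed at the assumed /apps/sqoop-import path -->
<workflow-app name="sqoop-import" xmlns="uri:oozie:workflow:0.5">
    <start to="sqoop-import"/>
    <action name="sqoop-import">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- jdbcUrl and targetDir are assumed job properties -->
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>${jdbcUrl}</arg>
            <arg>--table</arg>
            <arg>${tableName}</arg>
            <arg>--target-dir</arg>
            <arg>${targetDir}/${tableName}</arg>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Import failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

This does not remove the 125 entries from the parent file, but each entry becomes uniform and the Sqoop logic lives in one place, so a change to the import command is made once rather than 125 times.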

Again, I think you need to rethink your design.
