
How to submit a Hadoop job from another Hadoop job

I'm using Oozie to schedule a non-map-reduce Hadoop job. The Hadoop job runs in Oozie without any errors. I want to submit another non-map-reduce Hadoop job from that Hadoop job. How can I do that?

In Oozie you can run multiple Hadoop jobs by chaining actions within the same workflow; the only point to check is that each job goes in a separate action with a distinct action name.

The example below can help.

For example, suppose Oozie calls two shell scripts: the first, script_copy.sh, performs a distcp from cluster1 to cluster2, and the second downloads the copied dump from cluster2 to a local location. The workflow would look like this:

<workflow-app xmlns='uri:oozie:workflow:0.2' name='testworkflowaction'>
  <start to='shellAction_1'/>
   <action name="shellAction_1">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>oozie.launcher.mapred.job.queue.name</name>
                  <value>default</value>
                </property>
            </configuration>
            <exec>script_copy.sh</exec>
            <file>hdfs://cluster2:8020/location_of_script_in_cluster/script_copy.sh</file>
            <capture-output/>
        </shell>
        <ok to="shellAction_2"/>
        <error to="killAction"/>
    </action>
         <kill name="killAction">
               <message>Shell Action Failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
     </kill>
     <action name="shellAction_2">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>oozie.launcher.mapred.job.queue.name</name>
                  <value>default</value>
                </property>
            </configuration>
            <exec>script_download.sh</exec>
            <file>hdfs://cluster2:8020/location_of_script_in_cluster/script_download.sh</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="killAction"/>
    </action>
         <kill name="killAction">
               <message>Shell Action Failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
     </kill>
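To submit the workflow, you would normally upload the workflow directory to HDFS and point a job.properties file at it. A minimal sketch, assuming cluster2 hosts both the NameNode and the ResourceManager (the host names, ports, and application path below are placeholders, not from the original answer):

    nameNode=hdfs://cluster2:8020
    jobTracker=cluster2:8032
    oozie.wf.application.path=${nameNode}/user/${user.name}/testworkflowaction

The workflow can then be started with the standard Oozie CLI:

    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run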

So in the above example, the first action executes script_copy.sh, which copies data from cluster1 to cluster2 using distcp; once the distcp is complete, the second action downloads the same data from cluster2 to a local location using get or copyToLocal.
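A minimal sketch of what the two scripts might contain (the source and destination paths are placeholders; the original answer does not show the script bodies):

    #!/bin/bash
    # script_copy.sh -- copy the dump from cluster1 to cluster2 using distcp
    hadoop distcp hdfs://cluster1:8020/data/dump hdfs://cluster2:8020/data/dump

    #!/bin/bash
    # script_download.sh -- pull the copied dump from cluster2 to the local filesystem
    hdfs dfs -get hdfs://cluster2:8020/data/dump /tmp/dump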

Another option is to combine both steps in a single script and run it in a single action, though this approach becomes less useful as the script logic grows more complex.
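For instance, a combined script might just run the two steps in sequence (again a sketch with placeholder paths):

    #!/bin/bash
    # copy_and_download.sh -- distcp from cluster1 to cluster2, then copy to local
    set -e  # abort if the distcp fails, so we never download a partial dump
    hadoop distcp hdfs://cluster1:8020/data/dump hdfs://cluster2:8020/data/dump
    hdfs dfs -get hdfs://cluster2:8020/data/dump /tmp/dump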
