
How to schedule a standalone Python script in Oozie

I am trying to set up an Oozie job that triggers a Python script. Note that this is not a PySpark application, just a plain Python script.

I want this job to run on the master node only (I have installed the required libraries on the master node alone).

Is there any way to run this job on the master node when scheduling it from Oozie?

Your answers are much appreciated!

I have installed the dependent libraries on the master node and run the Python script manually, and it works as expected. All I am trying to do now is schedule it from Oozie.

There is no such thing as a Python action in Oozie. The nearest thing you can do is use a shell action and have the shell script invoke the Python code.

  <action name="action_name">
    <shell xmlns="uri:oozie:shell-action:0.1">
      ...
      ...
      <exec>shell_script.sh</exec>
      <argument>[ARGS]</argument>
      <file>[FULL_PATH_TO_SHELL_SCRIPT]</file>
    </shell>
    <ok to="action2"/>
    <error to="action2"/>
  </action>

Assuming you are asking how to add a PySpark step to your workflow:

You can either schedule a shell action that runs the spark-submit command: https://oozie.apache.org/docs/3.3.0/DG_ShellActionExtension.html

Or you can use the dedicated Spark action: https://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html

Just use a regular shell action. Oozie doesn't care what language the script is written in; the interpreter is chosen based on the script's hashbang.

Just remember that the first line of script.py should be something like #!/usr/bin/python3 (or wherever your Python interpreter is). A sample Oozie action that calls a Python script can then look like this (i.e. pretty standard):

  <action name='run_python_script'>
    <shell xmlns="uri:oozie:shell-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>script.py</exec>
      <argument>some argument</argument>
      <argument>some other argument</argument>
      <file>script.py</file>
    </shell>
    <ok to="end" />
    <error to="fail" />
  </action>
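
For reference, here is a minimal sketch of what script.py itself might look like, assuming the interpreter lives at /usr/bin/python3. The REQUIRED_MODULES list and the helper names are hypothetical placeholders, not taken from the original script. Since a shell action may run on any node of the cluster rather than on the master, the sketch logs the hostname and checks that its dependencies can be imported before doing any work. A non-zero exit code makes the shell action fail and follow its error transition.

```python
#!/usr/bin/python3
# The hashbang above must be the very first line of the file.
import importlib
import socket
import sys

# Hypothetical placeholders: replace with the libraries your script needs.
REQUIRED_MODULES = ("json", "csv")


def missing_modules(names):
    """Return the subset of module names that cannot be imported on this node."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def main(argv):
    """Run the job and return a process exit code (0 means success)."""
    # Log where the action actually ran; useful when debugging from Oozie logs.
    print("running on host:", socket.gethostname())
    print("arguments:", argv)

    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("missing modules on this node:", missing, file=sys.stderr)
        return 1

    # ... the real work of the script would go here ...
    return 0


if __name__ == "__main__":
    exit_code = main(sys.argv[1:])
    if exit_code != 0:
        # A non-zero exit status marks the shell action as failed.
        sys.exit(exit_code)
```

If the dependency check fails on some nodes, that is a sign the libraries need to be installed on every node that can run the action, not just the master.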

If you forget to add a hashbang to the Python script, the OS will try to run it with /bin/sh, which will probably fail with a syntax error.
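
A missing hashbang can be caught before the workflow is submitted. The following is a small sketch of such a pre-flight check; the function name read_hashbang is hypothetical and this is not part of Oozie, just a plain Python helper you could run locally against script.py.

```python
def read_hashbang(path):
    """Return the interpreter line without '#!', or None if there is no hashbang."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    # The kernel only honours '#!' when it is the first two bytes of the file.
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

For example, calling read_hashbang("script.py") on a script that starts with #!/usr/bin/python3 should return that interpreter path, while None means the script would be handed to /bin/sh.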
