notebook to execute Databricks job

Is there an API or some other way to run a Databricks job programmatically? Ideally, we would like to trigger a Databricks job from a notebook. The following only returns the ID of the currently running job, which is not very useful:

dbutils.notebook.entry_point.getDbutils().notebook().getContext().currentRunId().toString()

To run a Databricks job, you can use the Jobs API. I have a Databricks job called for_repro, which I ran from a Databricks notebook in the two ways shown below.

Using requests library:

  • Create an access token by navigating to Settings -> User settings. Under the Access token tab, click Generate token.
  • Use the generated token with the following code.
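import requests

# Replace the placeholders with your job ID and access token.
my_json = {"job_id": <your_job-id>}

auth = {"Authorization": "Bearer <your_access-token>"}

# Trigger the job; on success the JSON response includes the run_id of the new run.
response = requests.post('https://<databricks-instance>/api/2.0/jobs/run-now', json=my_json, headers=auth).json()
print(response)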


  • The <databricks-instance> value in the code above can be taken from your workspace URL.
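As a rough illustration of the last point, the sketch below pulls the host name out of a workspace URL and assembles the pieces of the run-now request without sending it. The function names `databricks_instance` and `run_now_request`, the sample URL, the token, and the job ID are all made up for this example; it assumes the instance value is simply the host portion of the URL.

```python
from urllib.parse import urlparse


def databricks_instance(workspace_url: str) -> str:
    """Return the host part of a workspace URL (assumed to be <databricks-instance>)."""
    # Accept URLs pasted without a scheme by adding one before parsing.
    if "://" not in workspace_url:
        workspace_url = "https://" + workspace_url
    host = urlparse(workspace_url).netloc
    if not host:
        raise ValueError(f"no host found in {workspace_url!r}")
    return host


def run_now_request(workspace_url: str, token: str, job_id: int):
    """Build (url, headers, payload) for the jobs/run-now call without sending it."""
    url = f"https://{databricks_instance(workspace_url)}/api/2.0/jobs/run-now"
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"job_id": job_id}
    return url, headers, payload


# Hypothetical values, for illustration only.
url, headers, payload = run_now_request(
    "https://adb-1234567890123456.7.azuredatabricks.net/?o=1234567890123456#job/42",
    "dapiEXAMPLE",
    42,
)
print(url)
```

The three returned values can then be passed to `requests.post(url, json=payload, headers=headers)` in the same way as in the code above.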


Using %sh magic command script:

  • You can also use the %sh magic command in a Python notebook cell to run a Databricks job.
%sh

curl --request POST --header "Authorization: Bearer <access_token>" \
https://<databricks-instance>/api/2.0/jobs/run-now \
--data '{"job_id": <your job id>}'

  • The job details and run history, for reference: [image: job details and run history]

Refer to the Microsoft documentation for the other operations available through the Jobs API.
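One of those other operations is checking on a run after it has been triggered. The sketch below polls a run until it reaches a terminal state. It is an illustration under assumptions rather than a tested recipe: `wait_for_run` and `make_get_run` are hypothetical helper names, and the set of terminal `life_cycle_state` values is assumed from the Jobs API 2.0 run-state description, so check it against the documentation for your API version.

```python
import time
from typing import Callable, Dict

# Life-cycle states after which a run is assumed not to change any more.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_run(get_run: Callable[[int], Dict], run_id: int,
                 poll_seconds: float = 10.0, max_polls: int = 360) -> Dict:
    """Poll get_run(run_id) until the run reaches a terminal life-cycle state.

    get_run is a hypothetical callable that returns the parsed JSON of
    GET /api/2.0/jobs/runs/get for the given run_id.
    """
    for _ in range(max_polls):
        state = get_run(run_id).get("state", {})
        if state.get("life_cycle_state") in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")


def make_get_run(instance: str, token: str) -> Callable[[int], Dict]:
    """Return a get_run callable backed by the requests library."""
    import requests  # imported here so wait_for_run can be used without it

    def get_run(run_id: int) -> Dict:
        resp = requests.get(
            f"https://{instance}/api/2.0/jobs/runs/get",
            params={"run_id": run_id},
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    return get_run
```

In a notebook this might be used as `wait_for_run(make_get_run("<databricks-instance>", "<your_access-token>"), response["run_id"])`, where `response` is the dictionary returned by the run-now call. Passing the fetch function in as an argument keeps the polling loop independent of the HTTP client.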
