
Azure Databricks - Running a Spark Jar from Gen2 Data Lake Storage

I am trying to run a spark-submit from Azure Databricks. Currently I can create a job with the jar uploaded to the Databricks workspace and run it.

My questions are:

  1. Is there a way to access a jar residing on Gen2 Data Lake storage and do a spark-submit from the Databricks workspace, or even from Azure ADF? (The communication between the workspace and the Gen2 storage is protected by "fs.azure.account.key".)

  2. Is there a way to do a spark-submit from a Databricks notebook?

Is there a way to access a jar residing on Gen2 Data Lake storage and do a spark-submit from the Databricks workspace, or even from Azure ADF? (The communication between the workspace and the Gen2 storage is protected by "fs.azure.account.key".)

Unfortunately, you cannot access a jar residing in an Azure Storage account such as ADLS Gen2/Gen1.

Note: The --jars, --py-files, --files arguments support DBFS and S3 paths.

Typically, the Jar libraries are stored under dbfs:/FileStore/jars.

You need to upload the libraries to DBFS and pass their DBFS paths as parameters in the jar activity.
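One possible way to get the jar into DBFS is to copy it from a notebook. This is only a minimal sketch, assuming it runs inside a Databricks notebook where `spark` and `dbutils` are predefined and that the storage account key is available to you. The helper names (`adls_jar_uri`, `stage_jar_in_dbfs`) and the default target path are hypothetical, not part of any Databricks API.

```python
def adls_jar_uri(container, account, path):
    """Build an abfss:// URI for a file in an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def stage_jar_in_dbfs(spark, dbutils, account, account_key, source_uri,
                      target="dbfs:/FileStore/jars/app.jar"):
    """Copy a jar from ADLS Gen2 to DBFS (hypothetical helper).

    `spark` and `dbutils` are the objects Databricks provides in a notebook.
    """
    # Make the account key known to the session so the abfss:// path is readable.
    spark.conf.set(f"fs.azure.account.key.{account}.dfs.core.windows.net", account_key)
    # Copy the jar to DBFS, where --jars and the jar activity can reference it.
    dbutils.fs.cp(source_uri, target)
    return target
```

In a notebook this could be called as `stage_jar_in_dbfs(spark, dbutils, "myaccount", key, adls_jar_uri("jars", "myaccount", "app.jar"))`, where the account and container names are placeholders.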

For more details, refer to "Transform data by running a jar activity in Azure Databricks using ADF".
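As a rough illustration of what such a jar activity definition looks like, the sketch below builds the JSON for an ADF activity of type `DatabricksSparkJar` that points at a jar in DBFS. The function name `build_jar_activity`, the linked service name, the class name, and the jar path are all placeholders chosen for this example; check the ADF documentation for the full set of properties.

```python
import json


def build_jar_activity(name, linked_service, main_class, jar_path, parameters=()):
    """Build an ADF Databricks jar activity definition as a dict (hypothetical helper)."""
    return {
        "name": name,
        "type": "DatabricksSparkJar",
        "linkedServiceName": {
            "referenceName": linked_service,  # name of your Azure Databricks linked service
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "mainClassName": main_class,
            "parameters": [str(p) for p in parameters],
            # The jar is referenced by its DBFS path.
            "libraries": [{"jar": jar_path}],
        },
    }


if __name__ == "__main__":
    activity = build_jar_activity(
        name="RunSparkJar",
        linked_service="AzureDatabricksLinkedService",
        main_class="com.example.Main",
        jar_path="dbfs:/FileStore/jars/app.jar",
        parameters=["10"],
    )
    print(json.dumps(activity, indent=2))
```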

Is there a way to do a spark-submit from a Databricks notebook?

To answer the second question, you may refer to the job types below:

[Image: job types]

Reference: "SparkSubmit" and "Create a job"

Hope this helps.


If this answers your query, please click "Mark as Answer" and "Up-Vote". If you have any further questions, do let us know.

Finally I figured out how to run this:

  1. You can run a Databricks jar from ADF and attach it to an existing cluster that has the ADLS key configured in the cluster.

  2. It is not possible to do a spark-submit from a notebook. But you can create a Spark job in Jobs, or you can use the Databricks Runs Submit API to do a spark-submit.
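The Runs Submit route from the second point can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: it targets the Jobs API 2.0 endpoint `/api/2.0/jobs/runs/submit` with a `spark_submit_task`, and the function names, the `spark_version`, the `node_type_id`, the jar path, and the class name are placeholders to replace with your own values.

```python
import json
import urllib.request


def build_spark_submit_run(jar_path, main_class, app_args=(), run_name="spark-submit-run"):
    """Build a Runs Submit request body that uses a spark_submit_task."""
    return {
        "run_name": run_name,
        "new_cluster": {
            "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
            "node_type_id": "Standard_DS3_v2",    # placeholder Azure VM type
            "num_workers": 2,
        },
        "spark_submit_task": {
            # Same order as on the spark-submit command line.
            "parameters": ["--class", main_class, jar_path, *[str(a) for a in app_args]],
        },
    }


def submit_run(workspace_url, token, payload):
    """POST the payload to the workspace and return the parsed JSON response."""
    request = urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `submit_run("https://<your-workspace>.azuredatabricks.net", token, build_spark_submit_run("dbfs:/FileStore/jars/app.jar", "com.example.Main", ["10"]))`, with the workspace URL and personal access token supplied by you.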


 