
Azure Databricks with Python scripts

I am new to Python and need help with Azure Databricks.

Scenario:

I am currently working on a project that uses an HDInsight cluster to submit Spark jobs. The jobs use Python scripts with classes and functions (.py files), which reside in the /bin/ folder on the edge node.

We propose to use Databricks instead of the HDInsight cluster, and the PoC should require minimal effort.

Doubts:

1. In the HDInsight cluster, all the Python scripts are stored in the /bin/ folder and the .yml conf files in the /conf/ folder.

Can we replicate the same structure in Databricks DBFS, so that the code needs only minimal changes to point to the new locations?

2. I have a bunch of scripts in the /bin/ folder. How can I upload or install those scripts in Databricks?

My assumption is that I need to create a package and install it on the cluster as a library. Correct me if I am wrong.

3. How do I run the Python scripts from Databricks?

@Sathya Can you provide more information on what the different Python scripts and the config files do?

As for the Python scripts, depending on what they do, you could create one or more Python notebooks in Databricks and copy the contents into them. You can then run these notebooks as part of a job, or use %run /path/to/notebook to reference them from other notebooks.
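If copying every script into a notebook is too much work for a PoC, one alternative (not part of the answer above, just a minimal sketch) is to keep the .py files in a folder and import them from a notebook by putting that folder on sys.path. The helper names register_script_dir and load_script, the module name my_etl, and the /dbfs/bin location are all hypothetical; the sketch also assumes the cluster exposes DBFS as a local path under /dbfs, which you should verify for your cluster type.

```python
import importlib
import sys
from pathlib import Path


def register_script_dir(script_dir):
    """Put a directory of .py files on sys.path so its modules can be imported."""
    resolved = str(Path(script_dir))
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    # Make sure the import system notices files written after startup.
    importlib.invalidate_caches()
    return resolved


def load_script(module_name, script_dir):
    """Import one script (named without the .py suffix) from script_dir."""
    register_script_dir(script_dir)
    return importlib.import_module(module_name)


# Hypothetical usage in a notebook, assuming the scripts were copied to
# dbfs:/bin/ and that DBFS is visible locally under /dbfs:
# etl = load_script("my_etl", "/dbfs/bin")
# etl.main()
```

This keeps the existing classes and functions unchanged. Building a proper package and installing it as a cluster library, as the question suggests, is the more robust option once the PoC is past the experimental stage.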
