How to run non-Spark code on a Databricks cluster?
I am able to pull data via databricks-connect and run Spark jobs perfectly. My question is how to run non-Spark, native Python code on the remote cluster. I am not sharing the code due to confidentiality.
When you're using databricks-connect, your local machine is the driver of your Spark job, so non-Spark code will always be executed on your local machine. If you want to execute it remotely, you need to package it as a wheel/egg, or upload the Python files onto DBFS (for example, via databricks-cli), and then execute your code as a Databricks job (for example, using the Runs Submit command of the Jobs REST API, or by creating a job with databricks-cli and using `databricks jobs run-now` to execute it).
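As a concrete illustration of the upload-and-submit flow described above, here is a minimal sketch using the legacy `databricks-cli`. The cluster ID, file paths, and run name are placeholders, and this assumes the CLI is installed and configured against your workspace (`pip install databricks-cli`, then `databricks configure --token`):

```shell
# 1. Upload the local Python script to DBFS
#    (path dbfs:/scripts/my_script.py is an arbitrary example location)
databricks fs cp my_script.py dbfs:/scripts/my_script.py

# 2. Submit a one-off run on an existing cluster via the Runs Submit API;
#    the script runs on the cluster's driver node, not your local machine
databricks runs submit --json '{
  "run_name": "non-spark-job",
  "existing_cluster_id": "1234-567890-abcde123",
  "spark_python_task": {
    "python_file": "dbfs:/scripts/my_script.py"
  }
}'
```

Alternatively, you can register a persistent job once with `databricks jobs create` and then trigger it repeatedly with `databricks jobs run-now --job-id <id>`. (No runnable test is included since these commands require credentials for a live Databricks workspace.)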