How to upgrade python version in Databricks
We upgraded Databricks from 10.3 to 10.4 LTS, but the Python version did not change from Python 3.8.10.
Question: In Databricks version 10.4, how can we upgrade the Python version from Python 3.8.10 to Python 3.10?
UPDATE: I would like to use some of the new functionality offered in Python 3.10, such as the match-case statement.
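For reference, a simple match-case statement can be rewritten so that it also runs on Python 3.8. This is only a sketch: the function name `describe_status` and the status codes are illustrative and do not come from the original question.

```python
def describe_status(code):
    """Map a numeric status code to a label without using match-case."""
    # Python 3.10+ form, shown for comparison only:
    #     match code:
    #         case 200: return "ok"
    #         case 404: return "not found"
    #         case _:   return "unknown"
    if code == 200:
        return "ok"
    elif code == 404:
        return "not found"
    else:
        return "unknown"


print(describe_status(404))
```

This only covers matching on literal values; structural patterns (sequences, mappings, classes) need more manual unpacking on older interpreters.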
It might not be possible to upgrade the version of Python inside a Databricks cluster. Each cluster has a pre-defined configuration which consists of specific versions of Spark, Scala and Python.
"We upgraded Databricks from 10.3 to 10.4 LTS. But the python version did not change from python 3.8.10"
This is because the Python version for both Databricks 10.3 and 10.4 LTS is 3.8.10. One solution would have been to edit the cluster and switch to a Databricks runtime which supports the required configuration. To do this, navigate to Compute -> click on your cluster -> Edit, and choose the required Databricks runtime.
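After changing the runtime, the interpreter version can be confirmed from a notebook cell attached to the cluster. A minimal sketch using only the standard library; the helper name `format_version` is made up for illustration:

```python
import sys


def format_version(info=sys.version_info):
    """Return 'major.minor.micro' for a version_info-like tuple."""
    major, minor, micro = info[:3]
    return f"{major}.{minor}.{micro}"


# Run in a notebook cell attached to the cluster to see what it actually uses.
print(format_version())
```

On the runtimes discussed here, this would be expected to print 3.8.10 before the change.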
But currently, the highest Python version supported in Azure Databricks is Python 3.9.5, provided by Databricks Runtime 11.1. Refer to the Microsoft documentation on Databricks runtimes to understand more about their features and configurations.
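The runtime/Python pairs mentioned in this answer can be put into a small lookup to see which runtime, if any, meets a minimum Python version. This is a sketch that contains only the three pairs stated above; it is not a complete list of Databricks runtimes, and the names `RUNTIME_PYTHON` and `runtimes_supporting` are illustrative.

```python
# Runtime -> Python version, limited to the pairs stated in this answer.
RUNTIME_PYTHON = {
    "10.3": (3, 8, 10),
    "10.4 LTS": (3, 8, 10),
    "11.1": (3, 9, 5),
}


def runtimes_supporting(min_version, table=RUNTIME_PYTHON):
    """Return the runtimes whose Python version is at least min_version."""
    return [name for name, version in table.items() if version >= min_version]


print(runtimes_supporting((3, 9)))   # ['11.1']
print(runtimes_supporting((3, 10)))  # []
```

The empty result for 3.10 is the reason a different approach is needed for match-case.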
You might be able to install Python 3.10.5 on a Docker image that a cluster can use instead of the standard runtime.
https://docs.databricks.com/clusters/custom-containers.html
You can build upon the minimal configuration. I have made a minimal example:
FROM databricksruntime/minimal:experimental
# Installs python 3.10 and virtualenv for Spark and Notebooks
RUN apt-get update \
    && apt-get install -y \
        python3.10 \
        virtualenv \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Initialize the default environment that Spark and notebooks will use
RUN virtualenv -p python3.10 --system-site-packages /databricks/python3
# Specifies where Spark will look for the python process
ENV PYSPARK_PYTHON=/databricks/python3/bin/python3
You will need to install all other Python libraries yourself, so the process is a bit more tedious.
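One way to find out which libraries still need installing in the custom image is to check, from a notebook on that cluster, which packages cannot be imported. A rough sketch using the standard library; the `required` list is a hypothetical example, not something the answer prescribes.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of libraries your notebooks depend on.
required = ["pandas", "numpy"]
print(missing_packages(required))
```

Anything reported here would need to be added to the Dockerfile (for example with pip inside the virtualenv) before the image is rebuilt. The check is meant for top-level package names only.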