
How to upgrade python version in Databricks

We upgraded Databricks from 10.3 to 10.4 LTS, but the Python version did not change from 3.8.10.

Question: In Databricks 10.4, how can we upgrade the Python version from 3.8.10 to 3.10?

UPDATE: I would like to use some of the new features offered in Python 3.10, such as the match-case statement.

It might not be possible to upgrade the Python version inside a Databricks cluster. Each cluster has a predefined configuration that consists of specific versions of Spark, Scala, and Python.

"We upgraded Databricks from 10.3 to 10.4 LTS. But the python version did not change from python 3.8.10"

  • This is because Databricks 10.3 and 10.4 LTS both ship with Python 3.8.10.
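To confirm which interpreter a cluster is actually running, a notebook cell along these lines can help. This is only a minimal sketch; `supports_match_statement` is a hypothetical helper name introduced here, not part of any Databricks API.

```python
import sys


def supports_match_statement(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.10 or newer.

    Hypothetical helper: match/case was added to the language in 3.10.
    """
    return tuple(version_info[:2]) >= (3, 10)


# Report the interpreter version of the current notebook or job
print("Running Python", sys.version.split()[0])
print("match/case available:", supports_match_statement())
```

On the runtimes mentioned above (Python 3.8.10) the second line would be expected to report that match/case is not available.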

One solution would have been to edit the cluster and switch to a Databricks runtime that supports the required configuration. To do this, navigate to Compute -> click on your cluster -> Edit and choose the required Databricks runtime.

But currently, the highest Python version supported in Azure Databricks is Python 3.9.5, provided by Databricks runtime 11.1. Refer to the Microsoft documentation to understand more about the features and configurations of Databricks runtimes.
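Until a runtime with Python 3.10 is available, logic that would use match-case can usually be written with plain if checks that also run on Python 3.8. The following is a rough sketch of that idea, not an exact equivalent of structural pattern matching; the `describe_command` function and its command tuples are made up purely for illustration.

```python
def describe_command(command):
    """Classify a command tuple without match/case (runs on Python 3.8).

    Roughly what this would look like on Python 3.10:
        match command:
            case ("load", path): ...
            case ("save", path, fmt): ...
            case _: ...
    """
    # Check the shape first, then unpack, which is what a sequence pattern does
    if isinstance(command, tuple) and len(command) == 2 and command[0] == "load":
        _, path = command
        return f"load from {path}"
    if isinstance(command, tuple) and len(command) == 3 and command[0] == "save":
        _, path, fmt = command
        return f"save to {path} as {fmt}"
    # Fallback branch, like `case _:`
    return "unknown command"


print(describe_command(("load", "/tmp/data.csv")))
print(describe_command(("save", "/tmp/out", "parquet")))
```

Note that a real sequence pattern also accepts lists, so this version is slightly stricter than the match-case form.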

You might be able to install Python 3.10.5 on a Docker image that the cluster uses instead of the standard runtime.

https://docs.databricks.com/clusters/custom-containers.html

You can build upon the minimal configuration. I have made a minimal example:

FROM databricksruntime/minimal:experimental

# Installs python 3.10 and virtualenv for Spark and Notebooks
RUN apt-get update \
  && apt-get install -y \
    python3.10 \
    virtualenv \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Initialize the default environment that Spark and notebooks will use
RUN virtualenv -p python3.10 --system-site-packages /databricks/python3

# Specifies where Spark will look for the python process
ENV PYSPARK_PYTHON=/databricks/python3/bin/python3

You will need to install all other Python libraries yourself, so the process is a bit more tedious.
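To see which libraries still need to be installed in such an image, a quick check like the one below can be run in a notebook attached to the cluster. It is a minimal sketch: `find_missing_packages` is a hypothetical helper, and the `required` list is an assumption that you would replace with the top-level modules your own notebooks import.

```python
import importlib.util


def find_missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Hypothetical list of modules; adjust to what your code actually imports
required = ["pandas", "numpy", "pyarrow"]
print("Missing:", find_missing_packages(required))
```

Anything reported as missing would then need to be added to the image, for example with a pip install step in the Dockerfile.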


 