
How to upgrade the Python version in Databricks

We upgraded Databricks from 10.3 to 10.4 LTS, but the Python version stayed at 3.8.10.

Question: In Databricks Runtime 10.4, how can we upgrade Python from 3.8.10 to 3.10?

UPDATE: I would like to use some of the new features offered in Python 3.10, such as the match-case statement.

It might not be possible to upgrade the Python version inside a Databricks cluster. Each cluster has a predefined configuration that consists of specific versions of Spark, Scala, and Python.

"We upgraded Databricks from 10.3 to 10.4 LTS. But the python version did not change from python 3.8.10"

  • This is because Databricks Runtime 10.3 and 10.4 LTS both ship with Python 3.8.10.
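To confirm which interpreter a cluster is actually running, a quick check in a notebook cell can help. This is only a minimal sketch: the helper name supports_match_statement is made up for illustration, and it relies on nothing beyond the standard sys module.

```python
import sys


def supports_match_statement(version_info=sys.version_info):
    """Return True if the given interpreter version is Python 3.10 or newer.

    Hypothetical helper for illustration; structural pattern matching
    (match-case) was added to the language in Python 3.10.
    """
    return tuple(version_info[:2]) >= (3, 10)


# Print the full version string of the interpreter running this cell.
print(sys.version)
print("match statement available:", supports_match_statement())
```

On a 10.3 or 10.4 LTS cluster the second line should report False, since both runtimes use Python 3.8.10.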

One solution would be to edit the cluster and switch to a Databricks runtime that provides the required configuration. To do this, navigate to Compute, click on your cluster, click Edit, and choose the required Databricks runtime.

But currently, the highest Python version supported in Azure Databricks is Python 3.9.5, provided by Databricks Runtime 11.1. Refer to the Microsoft documentation to learn more about the features and configurations of Databricks runtimes.
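The version figures mentioned in this answer can be put in a small lookup to see which runtime, if any, meets a minimum Python version. This is an illustrative sketch only: the table contains just the three runtimes quoted in this thread (it is not a complete or current list), and the names RUNTIME_PYTHON and runtimes_with_python_at_least are hypothetical.

```python
# Python version per Databricks runtime, using only the figures quoted in
# this thread. Not a complete or up-to-date list.
RUNTIME_PYTHON = {
    "10.3": (3, 8, 10),
    "10.4 LTS": (3, 8, 10),
    "11.1": (3, 9, 5),
}


def runtimes_with_python_at_least(minimum, table=RUNTIME_PYTHON):
    """Return the runtime names whose Python version is >= minimum."""
    return sorted(name for name, version in table.items() if version >= minimum)


print(runtimes_with_python_at_least((3, 9)))   # ['11.1']
print(runtimes_with_python_at_least((3, 10)))  # []
```

The empty result for (3, 10) matches the point above: none of the runtimes mentioned here provides Python 3.10.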

You might be able to install Python 3.10.5 in a Docker image that a cluster can use instead of the standard runtime.

https://docs.databricks.com/clusters/custom-containers.html

You can build upon the minimal configuration. Here is a minimal example:

FROM databricksruntime/minimal:experimental

# Installs python 3.10 and virtualenv for Spark and Notebooks
RUN apt-get update \
  && apt-get install -y \
    python3.10 \
    virtualenv \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Initialize the default environment that Spark and notebooks will use
RUN virtualenv -p python3.10 --system-site-packages /databricks/python3

# Specifies where Spark will look for the python process
ENV PYSPARK_PYTHON=/databricks/python3/bin/python3

You will need to install all the other Python libraries yourself, so the process is a bit more tedious.
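Since the custom image starts without the usual libraries, it can be useful to check what is still missing once the cluster is up. The following is a rough sketch that uses only the standard importlib module; the package names in REQUIRED are placeholders, and missing_packages is a hypothetical helper, not part of any Databricks API.

```python
import importlib.util


def missing_packages(required):
    """Return the names in `required` that cannot be found for import."""
    return [name for name in required if importlib.util.find_spec(name) is None]


# Placeholder list; replace with the libraries your notebooks actually need.
REQUIRED = ["pandas", "numpy", "pyarrow"]

print("missing:", missing_packages(REQUIRED))
```

Anything reported as missing would then have to be added to the image, for example with an extra pip install step in the Dockerfile.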
