
How to connect to Snowflake using python snowflake connector from within Databricks in Python 3?

When I try to attach the snowflake-sqlalchemy library to a Python 3 cluster in Databricks, it breaks my Python build, and I get the following error when I install subsequent libraries:

AttributeError: cffi library '_openssl' has no function, constant or global variable named 'Cryptography_HAS_ED25519'

I have tried attaching the latest version of the cryptography library to the cluster separately; however, this gave me the same issue. I think it might be related to the following links:

connecting-to-snowflake-from-azure-databricks-notebook-message-openssl-has-no-function-constant-or-global-variable-named-cryptography

https://github.com/snowflakedb/snowflake-connector-python/issues/32

The second link mentions a workaround:

The workaround is:
Uninstall cryptography by running pip uninstall cryptography
Delete the directory .../site-packages/cryptography/ manually
Reinstall snowflake-connector-python

Looks like the directory structure of cryptography changed since 1.7.2.*
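The directory that the quoted workaround asks to delete can be located without importing the package. The following is a minimal sketch, assuming Python 3.4 or later; the helper name `package_directory` is made up for illustration and is not part of any library.

```python
import importlib.util
from pathlib import Path


def package_directory(name):
    """Return the directory a top-level package would be loaded from, or None.

    find_spec() only locates a top-level package; it does not import it, so
    this works even when importing the package itself raises an error.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


# Print the path to inspect (or delete manually) for the cryptography package.
print(package_directory("cryptography"))
```

If this prints `None`, the package is not visible to the interpreter that ran the snippet, which may differ from the interpreter the cluster libraries use.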

Is there any way to uninstall the pre-installed cryptography 1.5 Python library within Databricks so that I can reinstall the latest version of cryptography (2.5) with the new directory structure?

The cause is out-of-date libraries. Run:

%sh sudo apt-get install python3-pip -y

Followed by:

%sh pip3 install --upgrade snowflake-connector-python

See https://datathirst.net/blog/2019/1/11/databricks-amp-snowflake-python-errors for more detail.

I have found an answer to my problem.

The issue is caused by the version of openssl in Databricks being too out of date for snowflake-sqlalchemy to work with.
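Before changing anything, it can help to record which versions are actually present on the cluster. The snippet below is a rough sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper names and the package list are illustrative choices, not something the original answer used. Note that `ssl.OPENSSL_VERSION` reports the OpenSSL build linked into Python's own `ssl` module, which is not necessarily the one the cryptography package binds to.

```python
import ssl
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("cryptography", "pyOpenSSL", "cffi",
                              "snowflake-sqlalchemy")):
    """Collect the OpenSSL build used by the ssl module plus package versions."""
    report = {"openssl (ssl module)": ssl.OPENSSL_VERSION}
    for name in packages:
        report[name] = installed_version(name)
    return report


for name, version in report_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running it once before and once after the fix shows whether pyopenssl and cryptography were really replaced.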

The solution is as follows:

  1. Upgrade pip

    %sh /databricks/python/bin/pip install --upgrade pip

  2. Uninstall pyopenssl

    %sh /databricks/python/bin/pip uninstall pyopenssl -y

  3. Install pyopenssl

    %sh /databricks/python/bin/pip install --upgrade pyopenssl

  4. Install snowflake-sqlalchemy

    %sh /databricks/python/bin/pip install --upgrade snowflake-sqlalchemy

The answer to this question was helpful: Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT'
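Once snowflake-sqlalchemy installs cleanly, the connection itself can be checked. The sketch below is only an illustration: `snowflake_url` and `check_connection` are hypothetical helpers, and the URL layout follows the `snowflake://<user>:<password>@<account>/<database>/<schema>?warehouse=...&role=...` form documented for snowflake-sqlalchemy. All credentials shown are placeholders.

```python
from urllib.parse import quote_plus, urlencode


def snowflake_url(user, password, account, database=None, schema=None,
                  warehouse=None, role=None):
    """Build a snowflake-sqlalchemy connection URL from its parts."""
    # User and password are percent-encoded so characters like '@' are safe.
    url = f"snowflake://{quote_plus(user)}:{quote_plus(password)}@{account}/"
    if database:
        url += database
        if schema:
            url += f"/{schema}"
    query = {k: v for k, v in (("warehouse", warehouse), ("role", role)) if v}
    if query:
        url += "?" + urlencode(query)
    return url


def check_connection(url):
    """Open a connection and return the Snowflake version string."""
    # Imported lazily: these imports only work after the install steps succeed.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    try:
        with engine.connect() as conn:
            return conn.execute(text("select current_version()")).fetchone()[0]
    finally:
        engine.dispose()


# Placeholder values; replace them with real ones before calling check_connection.
url = snowflake_url("my_user", "my_p@ssword", "my_account",
                    database="my_db", schema="public", warehouse="my_wh")
print(url)
```

If `check_connection(url)` still raises the `_openssl` AttributeError, the cluster is probably still importing the old pyopenssl or cryptography package.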

I have created an init file using the following code:

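# Create the directory that holds init scripts.
dbutils.fs.mkdirs("dbfs:/databricks/init/")

# Write the init script; the final True overwrites any existing file.
# The shebang directly follows the opening quotes so it is the first line of the file.
dbutils.fs.put("dbfs:/databricks/init/sf-initiation.sh", """#!/bin/bash
/databricks/python/bin/pip install --upgrade pip
/databricks/python/bin/pip uninstall pyopenssl -y
/databricks/python/bin/pip install --upgrade pyopenssl
/databricks/python/bin/pip install --upgrade snowflake-sqlalchemy
""", True)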

The last command in the file updates all outdated packages, as described in: Upgrading all packages with pip
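The init script can also be assembled from a list of commands, which makes it easier to guarantee that the shebang is the very first line and to read the file back after writing. This is a sketch under assumptions: `build_init_script` and `write_init_script` are hypothetical helpers, and `dbutils` is the object Databricks provides inside a notebook, passed in explicitly here.

```python
PIP = "/databricks/python/bin/pip"


def build_init_script(commands, shebang="#!/bin/bash"):
    """Join the shebang and the commands into script text, one per line."""
    return "\n".join([shebang] + list(commands)) + "\n"


# The same four commands as in the init file above.
SCRIPT = build_init_script([
    f"{PIP} install --upgrade pip",
    f"{PIP} uninstall pyopenssl -y",
    f"{PIP} install --upgrade pyopenssl",
    f"{PIP} install --upgrade snowflake-sqlalchemy",
])


def write_init_script(dbutils,
                      path="dbfs:/databricks/init/sf-initiation.sh",
                      script=SCRIPT):
    """Write the script to DBFS (overwriting) and return what was stored."""
    dbutils.fs.put(path, script, True)
    # Read the start of the file back so the result can be inspected.
    return dbutils.fs.head(path, 1024)


print(SCRIPT)
```

In a notebook, `print(write_init_script(dbutils))` writes the file and shows its contents in one step.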
