How to import a local module into an Azure Databricks notebook?
I'm trying to use a module in a Databricks notebook, but I am completely blocked. I'd like to execute the following command, or anything similar that allows me to create instances of MyClass:
from mypackage.mymodule import MyClass
Following the Databricks documentation, I have developed a Python package with a single module locally, as follows:
mypackage
|- __init__.py
|- setup.py
|- mymodule.py
Then I ran python setup.py bdist_wheel, obtaining a .whl file. The directory ends up being:
mypackage
|- build
   |- ... whatever
|- src.egg-info
   |- ... whatever
|- dist
   |- src-0.1-py3-none-any.whl
|- __init__.py
|- setup.py
|- mymodule.py
From here I've uploaded the .whl file into the Workspace following the instructions. But now I'm not able to import MyClass into any notebook.
I've tried all of the approaches below:
- Uploading the .whl with and without a name.
- Uploading the .whl, both installing it into the cluster and not.
- import mypackage
- dbutils.library.install('dbfs:/path/to/mypackage.whl/') (which returns True) and then using import ...
- Instead of using a .whl, creating the package folder in the same directory as the notebook.
- Using the Shared folder.
- import differentname
This is driving me crazy. It's such a simple task, which I can achieve easily with regular notebooks.
I've solved this by using Python's egg instead of wheel. Running python setup.py bdist_egg will create an egg, which you can install following the Databricks docs. I don't know why wheel doesn't work...
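One way to investigate why the wheel fails is to look at what it actually contains. A wheel is a zip archive, and the file name src-0.1-py3-none-any.whl in the question suggests the distribution was named src, so it is worth checking whether a top-level mypackage directory is in there at all. The following is only a minimal sketch: top_level_names is a hypothetical helper written for this illustration, not part of any Databricks or setuptools API.

```python
import zipfile


def top_level_names(wheel_path):
    """Return the sorted top-level entries stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Keep only the first path component of every archived file.
        return sorted({name.split("/", 1)[0] for name in wheel.namelist()})


# Hypothetical usage against the wheel from the question:
# print(top_level_names("dist/src-0.1-py3-none-any.whl"))
```

If mypackage does not appear among the returned names, then from mypackage.mymodule import MyClass cannot work after installing that wheel, however it is uploaded.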
With the introduction of support for arbitrary files in Databricks Repos, it is now possible to import custom modules/packages easily, if the module/package resides in the linked git repo.
First, make sure that Repos and support for arbitrary files in Repos are enabled. Both of these can be enabled from Settings -> Admin Console -> Workspace Settings.
Then, with the following directory structure in the git repo,
.
├── mypackage
│ ├── __init__.py
│ └── mymodule.py
└── test_notebook
it is possible to import the module mymodule in the package mypackage from test_notebook simply by executing the following statement:
# This is test_notebook in the above filetree
from mypackage.mymodule import MyClass
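The import presumably succeeds because the directory that holds mypackage ends up on sys.path. That mechanism can be reproduced outside Databricks. The sketch below uses a temporary directory as a stand-in for the repo root, and the body of MyClass (including its greet method) is made up for the illustration, since the original post does not show it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the repo root (an assumption; in Databricks this is the checked-out repo).
repo_root = Path(tempfile.mkdtemp())

# Recreate the layout from the file tree above, with hypothetical module contents.
package_dir = repo_root / "mypackage"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "mymodule.py").write_text(
    "class MyClass:\n"
    "    def greet(self):\n"
    "        return 'hello from MyClass'\n"
)

# Python resolves 'mypackage' by searching the directories listed in sys.path.
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))
importlib.invalidate_caches()

from mypackage.mymodule import MyClass

print(MyClass().greet())
```

If a notebook sits somewhere other than next to the package, adding the package's parent directory to sys.path in the same way is a common workaround.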