How to pass the script path to the %run magic command as a variable in a Databricks notebook?
I want to run a notebook in Databricks from another notebook using %run. I also want to be able to pass the path of the notebook that I'm running to the main notebook as a parameter.
The reason for not using dbutils.notebook.run is that I'm storing nested dictionaries in the notebook that's called and I want to use them in the main notebook.
I'm looking for something like:
path = "/References/parameterDefinition/schemaRepository"
%run <path variable>
You can pass arguments as documented on the Databricks web site: https://docs.databricks.com/notebooks/widgets.html#use-widgets-with-run
In the top notebook you can call
%run /path/to/notebook $X="10" $Y="1"
And then in the sub notebook, you can reference those arguments using the widgets API, as in
x_value = dbutils.widgets.get("X")
y_value = dbutils.widgets.get("Y")
To your specific question, it would look something like this, where "path" is the variable to be referenced via the widgets API in the target notebook:
%run /path/to/notebook $path="/path/to/notebook"
Magic commands such as %run and %fs do not allow variables to be passed in.
The workaround is to use dbutils instead, for example dbutils.notebook.run(notebook, 300, {})
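Since dbutils.notebook.run does not share variables with the caller, one possible way to still get a nested dictionary back is to return it as a JSON string. The following is only a minimal sketch of that round trip: the dictionary contents and variable names are hypothetical, and the dbutils calls appear only in comments (assuming the called notebook ends with dbutils.notebook.exit and the caller receives that string from dbutils.notebook.run), so the snippet runs as plain Python.

```python
import json

# Hypothetical nested dictionary defined in the called notebook.
schema_repository = {"orders": {"columns": {"id": "int", "total": "double"}}}

# Called notebook: serialize the dictionary to a string.
# In Databricks this string would be handed back with dbutils.notebook.exit(payload).
payload = json.dumps(schema_repository)

# Calling notebook: dbutils.notebook.run(...) returns that string,
# which is parsed back into a nested dictionary.
restored = json.loads(payload)
print(restored["orders"]["columns"]["total"])
```

This only works for values that JSON can represent, so it may not suit dictionaries that hold arbitrary Python objects.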
Unfortunately it's impossible to pass the path in %run as a variable. You can pass a variable as a parameter only, and only in combination with widgets - you can see the example in this answer. In this case you can have all your definitions in one notebook, and depending on the passed variable you can redefine the dictionary.
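Keeping all definitions in one notebook and picking one based on the passed value could look roughly like the sketch below. The dictionary contents, the key names, and the select_definitions helper are hypothetical. In Databricks the key would be read with dbutils.widgets.get in the called notebook; here it is a plain function argument so the snippet runs anywhere.

```python
# Hypothetical nested definitions, all kept in the one called notebook.
ALL_DEFINITIONS = {
    "schemaRepository": {"tables": {"orders": {"id": "int", "total": "double"}}},
    "otherRepository": {"tables": {"users": {"id": "int", "name": "string"}}},
}


def select_definitions(name):
    """Return the nested dictionary registered under `name`."""
    try:
        return ALL_DEFINITIONS[name]
    except KeyError:
        raise ValueError(f"unknown definition set: {name!r}") from None


# In the called notebook the name would be read with dbutils.widgets.get("X")
# (assumption: the caller ran `%run /path/to/notebook $X="schemaRepository"`).
parameter_definition = select_definitions("schemaRepository")
print(parameter_definition["tables"]["orders"])
```

Because %run shares the caller's environment, parameter_definition would then be visible in the main notebook as well.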
A new functionality is coming in the next months (approximately; see the public roadmap webinar for more details) that will allow importing notebooks as libraries using the import statement. Potentially you can emulate the same functionality by exporting the notebook into a file on disk using the Export command of the Workspace API, decoding the data, and importing the file's content. For example, if you have a notebook called module1 with content
my_cool_dict = {"key1": "abc", "key2": 123}
then you can import it as follows:
import base64
import os
import sys

import requests

# API endpoint and token of the current notebook session
api_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
host_token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
path = "/Users/..../module1"
# fetch notebook
response = requests.get(f"{api_url}/api/2.0/workspace/export",
    json={"format": "SOURCE", "path": path},
    headers={"Authorization": f"Bearer {host_token}"}
).json()
# decode base64 encoded content
data = base64.b64decode(response["content"].encode("ascii"))
# write the file & __init__.py, so the directory will be considered a module
module_dir = os.path.join("/tmp", "my_modules")
os.makedirs(module_dir, exist_ok=True)
with open(os.path.join(module_dir, os.path.split(path)[-1] + ".py"), "wb") as f:
    f.write(data)
with open(os.path.join(module_dir, "__init__.py"), "wb") as f:
    f.write("\n".encode("ascii"))
# add our directory into system path
sys.path.append(module_dir)
# import notebook
from module1 import my_cool_dict
and see that we got our variable.
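The decode, write, and import steps can be tried without a workspace. The sketch below is an illustration under assumptions, not the Databricks API: it replaces the export call with a locally base64-encoded string standing in for response["content"], writes to a temporary directory instead of /tmp/my_modules, and uses the hypothetical module name module1_sketch.

```python
import base64
import importlib
import os
import sys
import tempfile

# Stand-in for response["content"]: base64 text of the notebook source.
notebook_source = 'my_cool_dict = {"key1": "abc", "key2": 123}\n'
encoded_content = base64.b64encode(notebook_source.encode("utf-8")).decode("ascii")

# Decode and write the source as <module name>.py in a fresh directory.
module_dir = tempfile.mkdtemp(prefix="my_modules_")
module_name = "module1_sketch"  # hypothetical name, avoids clashing with a real module1
with open(os.path.join(module_dir, module_name + ".py"), "wb") as f:
    f.write(base64.b64decode(encoded_content))

# Make the directory importable and load the module by name.
sys.path.append(module_dir)
importlib.invalidate_caches()
module = importlib.import_module(module_name)
print(module.my_cool_dict)
```

Using importlib.import_module rather than a from ... import statement lets the module name be a variable, which is closer to what the question asks for.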
Did you try %run $path? I believe it should work.
You can't pass it as a variable while running the notebook like this:
In notebook1:
path_var = "/some/path"
%run ./notebook2
%path=path_var
However, what you can do, and what I did, is access the dbutils object or the variable of notebook1 in notebook2:
In notebook1:
dbutils.widgets.text("path","", "")
path_var = "/some/path"
%run ./notebook2
Then in notebook2:
"""
No need to define path widget in Notebook2 like this:
dbutils.widgets.text("path","", "")
"""
path = dbutils.widgets.get("path")
print(path)
Output: /some/path
"""
Or you can access the path_var of Notebook1 directly
without defining it anywhere in Notebook2 like this
"""
print(path_var)
Output: /some/path
So this helps when you are using complicated variables, such as heavily nested dictionaries, in the notebook.
What I love about this approach is that the environment of notebooks gets shared when you call a notebook, meaning you can access variables & methods of Notebook1 in some NotebookN and vice versa.
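As a rough plain-Python analogy of that shared environment (this is not how Databricks implements %run, and every name here is hypothetical), the sketch below runs a second piece of source inside the same namespace dictionary, so names flow in both directions.

```python
# Names defined by the "calling notebook" before the %run-like step.
shared_namespace = {"path_var": "/some/path"}

# Stand-in for the source of the "called notebook".
notebook2_source = """
seen_path = path_var                        # reads a name set by the caller
nested_config = {"level1": {"level2": 8}}   # defines a name for the caller
"""

# Execute the called source inside the caller's namespace.
exec(notebook2_source, shared_namespace)

print(shared_namespace["seen_path"])
print(shared_namespace["nested_config"]["level1"]["level2"])
```

The called source sees path_var without any parameter passing, and the caller sees nested_config afterwards, which is the behaviour the answer relies on.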