
How to pass the script path to the %run magic command as a variable in a Databricks notebook?

I want to run a notebook in Databricks from another notebook using %run. I also want to be able to give the path of the notebook that I'm running as a parameter in the main notebook.
The reason for not using dbutils.notebook.run is that I'm storing nested dictionaries in the called notebook and I want to use them in the main notebook.

I'm looking for something like:

path = "/References/parameterDefinition/schemaRepository"
%run <path variable>

You can pass arguments as documented on the Databricks website: https://docs.databricks.com/notebooks/widgets.html#use-widgets-with-run

In the top notebook you can call

%run /path/to/notebook $X="10" $Y="1"

And then in the sub notebook, you can reference those arguments using the widgets API, as in

x_value = dbutils.widgets.get("X")
y_value = dbutils.widgets.get("Y")

For your specific question, it would look something like this, where "path" is the variable to be referenced via the widgets API in the target notebook:

%run /path/to/notebook $path="/path/to/notebook"

Magic commands such as %run and %fs do not allow variables to be passed in.

The workaround is to use dbutils, for example dbutils.notebook.run(notebook, 300, {})
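
One caveat with this workaround: dbutils.notebook.run executes the target notebook separately and hands back only the string that the target passes to dbutils.notebook.exit, so nested dictionaries have to be serialized to cross that boundary. The following is a minimal sketch, assuming the dictionary is JSON-serializable. The helper names and the sample dictionary are hypothetical, and the dbutils calls appear only as comments because they exist only inside Databricks.

```python
import json

# Hypothetical nested dictionary defined in the called notebook.
schema_repository = {
    "customers": {"columns": {"id": "int", "name": "string"}},
    "orders": {"columns": {"id": "int", "total": "double"}},
}


def to_exit_value(definitions):
    """Serialize a nested dict into the single string a notebook can return."""
    return json.dumps(definitions)


def from_exit_value(exit_value):
    """Rebuild the nested dict from the string returned to the caller."""
    return json.loads(exit_value)


# In the called notebook (Databricks only):
#     dbutils.notebook.exit(to_exit_value(schema_repository))
# In the calling notebook (Databricks only):
#     raw = dbutils.notebook.run(notebook, 300, {})
#     schema_repository = from_exit_value(raw)

# Local round trip standing in for the two notebooks.
raw = to_exit_value(schema_repository)
restored = from_exit_value(raw)
print(restored["orders"]["columns"]["total"])
```

This only carries data across; functions and non-JSON objects defined in the called notebook would not survive the round trip.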

Unfortunately, it's impossible to pass the path in %run as a variable. You can pass a variable only as a parameter, and only in combination with widgets - you can see the example in this answer. In this case you can keep all your definitions in one notebook and, depending on the passed variable, redefine the dictionary.
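
The "all definitions in one notebook" idea could look roughly like the sketch below. It assumes the called notebook holds every definition set in one registry and picks one by name. The registry contents, the function name and the widget name "name" are all hypothetical, and the widget lookup is shown only as a comment because dbutils exists only inside Databricks.

```python
# Hypothetical definitions that would all live in the one called notebook.
DEFINITIONS = {
    "schemaRepository": {"customers": {"id": "int", "name": "string"}},
    "parameterDefaults": {"retries": {"max": 3, "backoff_seconds": 10}},
}


def select_definitions(name):
    """Return the nested dictionary registered under `name`."""
    try:
        return DEFINITIONS[name]
    except KeyError:
        known = ", ".join(sorted(DEFINITIONS))
        raise ValueError(
            f"unknown definition set {name!r}; expected one of: {known}"
        ) from None


# Inside Databricks the name would come from the widget set by %run, e.g.
#     name = dbutils.widgets.get("name")
# Here a literal stands in for the widget value.
name = "schemaRepository"
selected = select_definitions(name)
print(selected)
```

Because the called notebook runs inline under %run, a variable such as `selected` would then be visible in the calling notebook as well.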

New functionality is coming in the next months (approximately; see the public roadmap webinar for more details) that will allow importing notebooks as libraries using the import statement. In the meantime, you can potentially emulate it by exporting the notebook to a file on disk using the Export command of the Workspace API, decoding the data, and importing the file's content. For example, if you have a notebook called module1 with content

my_cool_dict = {"key1": "abc", "key2": 123}

then you can import it as follows:

import requests
import base64
import os


api_url = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
host_token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

path = "/Users/..../module1"

# fetch notebook
response = requests.get(f"{api_url}/api/2.0/workspace/export",
    json = {"format": "SOURCE", "path": path},
    headers={"Authorization": f"Bearer {host_token}"}
  ).json()

# decode base64 encoded content
data = base64.b64decode(response["content"].encode("ascii"))

# write the file & __init__.py, so the directory will be considered a module
dir = os.path.join("/tmp","my_modules")
if not os.path.exists(dir):
    os.mkdir(dir)

with open(os.path.join(dir, os.path.split(path)[-1]+".py"), "wb") as f:
  f.write(data)
with open(os.path.join(dir, "__init__.py"), "wb") as f:
  f.write("\n".encode("ascii"))

# add our directory into system path
import sys
sys.path.append(dir)

# import notebook
from module1 import my_cool_dict

After the import, my_cool_dict is available in the calling notebook (the original answer showed a screenshot of the result here).
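
As a variation on the last two steps, the decoded source could be loaded with importlib instead of appending to sys.path. This is a sketch, not part of the original answer: the helper name is hypothetical, and a literal byte string stands in for the decoded `data` so that the example runs outside Databricks.

```python
import importlib.util
import os
import tempfile


def load_module_from_source(module_name, source_bytes):
    """Write source bytes to a temp file and import them as `module_name`."""
    directory = tempfile.mkdtemp(prefix="my_modules_")
    file_path = os.path.join(directory, module_name + ".py")
    with open(file_path, "wb") as f:
        f.write(source_bytes)
    # Build the module directly from the file, leaving sys.path untouched.
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Stand-in for the decoded `data` returned by the export call.
data = b'my_cool_dict = {"key1": "abc", "key2": 123}\n'
module1 = load_module_from_source("module1", data)
print(module1.my_cool_dict)
```

Loading by file path keeps the temporary directory out of the global import path, which avoids name clashes if several notebooks are exported this way.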

Did you try %run $path? I believe it should work.

Problem

You can't pass it as a variable while running the notebook like this:

In notebook1:

path_var = "/some/path"
%run ./notebook2
%path=path_var

Solution

However, what you can do, and what I did, is access the dbutils object or the variables of notebook1 in notebook2:

In notebook1:

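# Cell 1: create the widget (the second argument is its default value) and define the variable
dbutils.widgets.text("path", "/some/path", "")
path_var = "/some/path"

# Cell 2: %run must be in a cell by itself
%run ./notebook2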

Then in notebook2:

"""
No need to define path widget in Notebook2 like this:
dbutils.widgets.text("path","", "")
"""
path = dbutils.widgets.get("path")
print(path)

Output: /some/path

"""
Or you can access the path_var of Notebook1 directly
without defining it anywhere in Notebook2 like this 
"""
print(path_var)

Output: /some/path

So this helps when you are using complicated variables, such as heavily nested dictionaries, in the notebook.

Benefit

What I love about this approach is that the notebooks' environment is shared when you call a notebook, meaning you can access the variables and methods of Notebook1 in some NotebookN and vice versa, such that:

  • Notebook1 is calling Notebook2
  • Notebook2 is calling Notebook3 ...
  • Notebook(N-1) is calling NotebookN
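
As a rough plain-Python analogy for this shared environment (an illustration only, not how Databricks implements %run), running the called notebook inline behaves much like executing its source in the caller's namespace. The source string and variable names below are hypothetical.

```python
# Hypothetical source of the "called notebook".
notebook2_source = """
nested = {"level1": {"level2": {"answer": 42}}}
seen_by_notebook2 = path_var  # defined by the caller, visible here
"""

# The "calling notebook" namespace, with one variable already defined.
namespace = {"path_var": "/some/path"}

# Run the called source inline, in the caller's namespace.
exec(notebook2_source, namespace)

# The caller's variable was readable inside the called source ...
print(namespace["seen_by_notebook2"])
# ... and the nested dictionary it defined is now visible to the caller.
print(namespace["nested"]["level1"]["level2"]["answer"])
```

The same sharing is why name clashes matter: anything the called notebook assigns overwrites a caller variable of the same name.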

Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address.
