Jupyter Notebook/Lab: set current directory to ipynb file's
We have an existing workflow in vanilla Jupyter Notebook/Lab where we use relative paths to store the outputs of some notebooks. Example:
/home/user/notebooks/notebook1.ipynb
/home/user/notebooks/notebook1_output.log
/home/user/notebooks/project1/project.ipynb
/home/user/notebooks/project1/project_output.log
In both notebooks, we produce the output by simply writing to ./output.log or similar.
However, we are now trying Google Dataproc with the Jupyter optional component, and the current directory is always / regardless of which notebook the code is run from. This applies to both the Notebook and Lab interfaces.
Disabling c.FileContentsManager.root_dir='/' in /etc/jupyter/jupyter_notebook_config.py causes the current directory to be set to wherever I started jupyter notebook from, but it is always that initial starting folder instead of following the .ipynb notebook files.
Any idea how to restore the "dynamic" current directory behaviour?
Even if it's not possible, I'd like to understand how Dataproc makes Jupyter behave differently.
Dataproc image: 2.0-debian10
Notebook server: 6.2.0
JupyterLab: 3.0.18
No, it is not possible to always get the directory where your .ipynb file is as the current directory. Jupyter is running from the local filesystem of the master node of your cluster. It will always take the default system path for its kernel.
In other cases (besides Dataproc), it is also not possible to consistently get the path of a Jupyter notebook. You can check out this thread regarding this topic.
You have to specify the directory path explicitly for your log file to be saved in the desired location.
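As a minimal sketch of what specifying the directory explicitly could look like: the helper name write_log and its parameters are hypothetical, not part of Jupyter or Dataproc, and the directory in the usage comment is just the example path from the question.

```python
from pathlib import Path


def write_log(log_dir, message, filename="output.log"):
    """Append one line to <log_dir>/<filename> and return the file's path.

    Hypothetical helper: the target directory is passed in explicitly,
    so the result does not depend on the kernel's current directory.
    """
    directory = Path(log_dir).expanduser()
    # Create the directory if it does not exist yet.
    directory.mkdir(parents=True, exist_ok=True)
    log_path = directory / filename
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(message + "\n")
    return log_path


# Example call from a notebook cell (path taken from the question):
# write_log("/home/user/notebooks/project1", "run finished")
```

Because the path is absolute, the file ends up in the same place whether the kernel starts in / or anywhere else.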
Note that the GCS folder in your Lab refers to the Google Cloud Storage bucket of your cluster. You can create an .ipynb file in GCS, but when you execute it, it runs on the local system. Thus you will not be able to save log files in GCS directly.
EDIT:
It's not only Dataproc that makes Jupyter behave differently. If you use Google Colab notebooks, you will see the same behaviour there.
The reason is that you are always executing code in the kernel, regardless of where the file is. In theory, multiple notebooks could connect to that kernel, so you can't have multiple working directories for the same kernel.
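The point that a kernel has exactly one working directory can be illustrated with plain Python, since a kernel is an ordinary process. In this sketch the function names are made up for illustration; only os.getcwd and os.chdir are real standard-library calls.

```python
import os
from pathlib import Path


def resolve_in_cwd(relative_name):
    """Return the absolute path a relative name maps to right now."""
    # Relative paths are interpreted against the process's single
    # current working directory, not against any notebook file.
    return Path(os.getcwd()) / relative_name


def switch_and_resolve(new_dir, relative_name):
    """Change the process-wide working directory, then resolve the name."""
    # os.chdir affects the whole process: every piece of code running
    # in the same kernel sees the new directory afterwards.
    os.chdir(new_dir)
    return resolve_in_cwd(relative_name)
```

Calling resolve_in_cwd("output.log") in a fresh kernel shows where a relative write would actually land; on the Dataproc setup described in the question that would be under /.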
As mentioned, by default when you start a notebook, the current working directory is set to the directory containing the notebook.
Link to the main thread -> https://github.com/ipython/ipython/issues/10123
A general solution for most use cases seems to be what is described in this comment on the GitHub issue: https://github.com/ipython/ipython/issues/10123#issuecomment-354889020
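Separately from whatever the linked comment proposes, one possible workaround is to switch the working directory yourself around the code that writes relative paths. The following is only a sketch under that assumption: working_directory is a hypothetical helper, and the notebook's directory still has to be supplied by hand because the kernel does not know it.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily run code with `path` as the current directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the old directory even if the body raised an error.
        os.chdir(previous)


# Example use in a notebook cell (directory taken from the question):
# with working_directory("/home/user/notebooks/project1"):
#     with open("./output.log", "a") as handle:
#         handle.write("run finished\n")
```

Restoring the previous directory on exit keeps the change from leaking into other code that shares the kernel.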