Jupyter Notebook/Lab set current directory to ipynb file's

Desired behaviour

We have an existing workflow in vanilla Jupyter Notebook/Lab where we use relative paths to store outputs of some notebooks. Example:

  • /home/user/notebooks/notebook1.ipynb
  • /home/user/notebooks/notebook1_output.log
  • /home/user/notebooks/project1/project.ipynb
  • /home/user/notebooks/project1/project_output.log

In both notebooks, we produce the output by simply writing to a relative path such as ./output.log.

Problem

However, we are now trying Google Dataproc with the Jupyter optional component, and the current directory is always / regardless of which notebook the code is run from. This applies to both the Notebook and Lab interfaces.
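A quick way to confirm the symptom is to ask the kernel where it is running and where a relative path would land. This is a minimal sketch using only the standard library; the helper name `describe_relative_target` is made up for illustration.

```python
import os
from pathlib import Path


def describe_relative_target(relative_path="./output.log"):
    """Return the kernel's working directory and the absolute path
    that a relative path would resolve to from there."""
    cwd = Path(os.getcwd())
    target = Path(relative_path).resolve()
    return cwd, target


cwd, target = describe_relative_target()
print(f"kernel cwd: {cwd}")
print(f"./output.log resolves to: {target}")
```

If the first line prints / on the cluster, every notebook's relative writes end up in the filesystem root, whichever folder the .ipynb file lives in.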

What I've tried

Disabling c.FileContentsManager.root_dir='/' in /etc/jupyter/jupyter_notebook_config.py causes the current directory to be set to wherever I started jupyter notebook from, but it is always that initial starting folder instead of following the .ipynb notebook files.

Any idea on how to restore the "dynamic" current directory behaviour?

Even if it's not possible, I'd like to understand how Dataproc even makes Jupyter behave differently.

Details

  • Dataproc Image 2.0-debian10
  • Notebook Server 6.2.0
  • Jupyterlab 3.0.18

No, it is not possible to always get the current directory where your .ipynb file is. Jupyter runs on the local filesystem of the master node of your cluster, and its kernel will always use the default system path.

In other cases (besides Dataproc) it is also not possible to consistently get the path of a Jupyter notebook. You can check out this thread regarding this topic.

You have to specify the directory path explicitly for your log file to be saved in the desired location.
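One way to do that is to build the log path from an explicit base directory instead of relying on the kernel's working directory. The sketch below is only an illustration: `NOTEBOOK_DIR` reuses the example path from the question and must be set by hand in each notebook, and `append_log` is a hypothetical helper, not part of Jupyter.

```python
from pathlib import Path

# Assumption: the notebook's directory is written out by hand,
# here using the example layout from the question.
NOTEBOOK_DIR = Path("/home/user/notebooks/project1")


def append_log(notebook_dir, message, filename="output.log"):
    """Append one line to <notebook_dir>/<filename> and return the file path.

    The target directory is created if it does not exist yet.
    """
    path = Path(notebook_dir) / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(message + "\n")
    return path
```

In a notebook cell this would be called as `append_log(NOTEBOOK_DIR, "run finished")`. The write then goes to the same file no matter what the kernel's current directory is.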

Note that the GCS folder in your Lab refers to the Google Cloud Storage bucket of your cluster. You can create an .ipynb in GCS, but when you execute the file it runs on the local system. Thus you will not be able to save log files in GCS directly.


EDIT:

It's not only Dataproc that makes Jupyter behave differently. If you use Google Colab notebooks, you will see the same behaviour there.

The reason is that you're always executing code in the kernel, no matter where the file is. In theory, multiple notebooks could connect to that kernel, so you can't have multiple working directories for the same kernel.
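Since the working directory belongs to the kernel process rather than to a notebook, a cell can still change it on purpose with `os.chdir`. The following is a minimal sketch of a context manager that switches directory for a block of code and then restores the previous one; the name `working_directory` is made up for illustration, and the directory to switch to still has to be supplied by hand.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the process-wide working directory.

    The change affects everything running in the same kernel,
    so the previous directory is always restored on exit.
    """
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)
```

Inside `with working_directory("/home/user/notebooks/project1"):` a write to ./output.log would land in that folder. Restoring the old directory afterwards limits surprises for any other notebook attached to the same kernel.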

As mentioned earlier, by default when you start a notebook, the current working directory is set to the path of the notebook.

Link to the main thread: https://github.com/ipython/ipython/issues/10123

A general solution for most use cases seems to be what is described in this comment on the GitHub issue: https://github.com/ipython/ipython/issues/10123#issuecomment-354889020

