
Can you use a Jupyter notebook on my GCP VM to run TPU training in Google Cloud?

I am switching from running TPUs in Colab to running TPUs in Google Cloud. I am used to running training in the Colab Jupyter notebook, but according to the GCP TPU quickstart guide, I'll need to use the shell and convert my code into a script.

https://cloud.google.com/tpu/docs/quickstart

Is there a way to open a Jupyter notebook on my GCP VM?

Yes, you can open and run a Jupyter notebook on your GCP VM. There are probably other ways to do this, but here is what I followed and what worked for me:

Phase 1 - Make sure you have set up your GCP project and created a VM instance in a zone where TPUs are supported. I used us-central1-f.

Phase 2 - Make sure your VM (Compute Engine), Cloud TPU and Cloud Storage are all set up and linked according to the instructions provided here - https://cloud.google.com/tpu/docs/quickstart

Phase 3 - For the VM, create a firewall rule with the following settings:

  • Name:
  • Targets: All instances in the network
  • Source IP ranges: 0.0.0.0/0
  • Protocols and ports: Select the "Specified protocols and ports" option.
  • tcp: 8888
Keep the other settings at their defaults.
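
The same rule can also be created from the command line instead of the console. The sketch below only assembles a `gcloud compute firewall-rules create` command matching the settings listed above and prints it for review. The rule name `allow-jupyter` and the network `default` are assumptions for illustration, since the answer does not say which name or network was used.

```python
import shlex


def build_firewall_command(rule_name, network="default", port=8888,
                           source_range="0.0.0.0/0"):
    """Return the gcloud argument list for an ingress rule on one TCP port.

    rule_name and network are placeholders chosen for this sketch; the
    port and source range mirror the settings listed in Phase 3.
    """
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        f"--network={network}",
        "--direction=INGRESS",
        f"--allow=tcp:{port}",
        f"--source-ranges={source_range}",
    ]


if __name__ == "__main__":
    # Print the command so it can be checked before running it in a shell.
    print(shlex.join(build_firewall_command("allow-jupyter")))
```

No target tags are passed, which corresponds to "All instances in the network". Narrowing `source_range` to your own IP address limits who can reach the port.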

Phase 4 - You need to install the following:

  • Anaconda
wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
bash Anaconda3-4.2.0-Linux-x86_64.sh
  • TensorFlow, Keras and any other libraries you need
source ~/.bashrc
pip install tensorflow
pip install keras

Phase 5 - Set up your Jupyter configuration:

$ jupyter notebook --generate-config
$ nano ~/.jupyter/jupyter_notebook_config.py # I use nano editor

Add these four lines at the top of the config file and save it:

c = get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888
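
If you would rather not edit the file by hand, the same change can be scripted. This is a minimal sketch, assuming the config file lives at the path shown in Phase 5; the helper name `prepend_settings` is hypothetical and not part of Jupyter.

```python
from pathlib import Path

# The four settings from the answer, kept verbatim.
JUPYTER_SETTINGS = (
    "c = get_config()\n"
    "c.NotebookApp.ip = '*'\n"
    "c.NotebookApp.open_browser = False\n"
    "c.NotebookApp.port = 8888\n"
)


def prepend_settings(config_path, settings=JUPYTER_SETTINGS):
    """Put the settings at the top of the config file.

    Returns True if the file was changed, False if the settings were
    already at the top (so running it twice does not duplicate them).
    """
    config_path = Path(config_path)
    existing = config_path.read_text() if config_path.exists() else ""
    if existing.startswith(settings):
        return False
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(settings + "\n" + existing)
    return True
```

On the VM it would be called as `prepend_settings(Path.home() / ".jupyter" / "jupyter_notebook_config.py")` after `jupyter notebook --generate-config` has created the file.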

And that's it. You just need to run

$ jupyter notebook

and open http://your_external_IP:8888 in your browser.
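
If the page does not load, it helps to find out whether port 8888 is reachable at all before looking at Jupyter itself. The sketch below is one way to check from your local machine using only the standard library; the function name `port_is_open` is hypothetical, and the host you pass in is your VM's external IP.

```python
import socket


def port_is_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A `False` result for your external IP usually points at the firewall rule from Phase 3 or at Jupyter not listening on all interfaces, rather than at the notebook server itself.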

If you're using the Helm chart for JupyterHub on GKE, it appears that you can also use a JupyterHub profile. Make sure to set the correct overrides for the KubeSpawner settings:

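singleuser:
  profileList:
    - display_name: "TPU"  # placeholder name; the original snippet omits this line
      kubespawner_override:
        scheduler_name: default-scheduler
        extra_annotations:
          tf-version.cloud-tpus.google.com: "pytorch-1.11"
        extra_resource_limits:
          cloud-tpus.google.com/v2: 8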

It's not documented, but you'll need to set the scheduler to "default-scheduler", since GKE requires it to spawn TPU instances.

Additional documentation here:

https://cloud.google.com/tpu/docs/kubernetes-engine-setup#job-spec

https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html
