How to package sub-folder for gcloud ml?

I am trying to upload my project to Google Cloud ml-engine for training. I followed the "getting started" guide, substituting my own files in the relevant places.

I can train locally using:

gcloud ml-engine local train --module-name="my-model.task" --package-path=my-model/ -- ./my_model/model_params_google.json

Yes, I have dashes in the module name :(. I also made a symbolic link my_module -> my-module so that I can use the name with underscores instead of dashes. In any case, I don't think this is the problem, since the above command works well locally.

My folder structure doesn't follow the recommended one, since the project existed before I started thinking about ml-engine. It looks like this:

my-model/
    ├── __init__.py
    ├── setup.py
    ├── task.py
    ├── model_params_google.json
    ├── src
    │   ├── __init__.py
    │   ├── data_handler.py
    │   ├── elastic_helpers.py
    │   ├── model.py

The problem is that the src folder is not packaged/uploaded with the code, so in the cloud, the line from .src.model import model_fn in task.py fails.

The command I use for packaging is (run from the folder my-model/../ ):

gcloud ml-engine jobs submit training my_model_$(date +"%Y%m%d_%H%M%S") \
    --staging-bucket gs://model-data \
    --job-dir $OUTPUT_PATH \
    --module-name="my_model.task" \
    --package-path=my_model/ \
    --region=$REGION \
    --config config.yaml --runtime-version 1.8 \
    -- \
    tf_crnn/model_params_google.json --verbosity DEBUG

It packages my-model.0.0.0.tar.gz without the contents of my-model/src . I cannot figure out why. I'm using the example setup.py :

from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = ['tensorflow>=1.8']

setup(
    name='my_model',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    description='my first model'
)

So, the question is: why does gcloud not pack the src folder?

You need to put the setup.py in the directory above my-model .
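To see why the location of setup.py matters, the sketch below imitates the package-discovery rule in plain Python. It is only a rough stand-in for setuptools' find_packages(), based on the assumption that a directory counts as a package only when it contains an __init__.py and that discovery starts at the directory holding setup.py. The helper names discover_packages and make_layout are hypothetical, and the layout is a trimmed copy of the one in the question (using the underscore name my_model).

```python
import os
import tempfile


def discover_packages(where):
    """Simplified stand-in for find_packages(where), NOT the real implementation.

    A sub-directory is reported as a package only if it contains __init__.py;
    directories that are not packages are not descended into.
    """
    found = []
    for root, dirs, _files in os.walk(where):
        keep = []
        for name in sorted(dirs):
            full = os.path.join(root, name)
            if os.path.isfile(os.path.join(full, "__init__.py")):
                found.append(os.path.relpath(full, where).replace(os.sep, "."))
                keep.append(name)
        dirs[:] = keep  # prune non-package directories from the walk
    return found


def make_layout(base):
    """Create a trimmed copy of the question's layout under `base`."""
    os.makedirs(os.path.join(base, "my_model", "src"))
    for rel in (
        "my_model/__init__.py",
        "my_model/task.py",
        "my_model/src/__init__.py",
        "my_model/src/model.py",
    ):
        open(os.path.join(base, *rel.split("/")), "w").close()


with tempfile.TemporaryDirectory() as base:
    make_layout(base)
    # setup.py inside my_model/: discovery starts inside the package itself
    inside = discover_packages(os.path.join(base, "my_model"))
    # setup.py one level above my_model/: discovery starts at the parent
    above = discover_packages(base)

print("setup.py inside my_model:", inside)
print("setup.py above my_model: ", above)
```

Under these assumptions, starting inside my_model only finds src as a stray top-level package, so task.py belongs to no discovered package. Starting one level up finds my_model and my_model.src, which is what the relative import in task.py expects.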

You can check your results by invoking:

python setup.py sdist

Then un-tar the tarball in the dist directory. As is, you'll see that task.py is not included in the tarball.

By moving setup.py one directory higher and repeating, you'll see that task.py is included, as is everything in src.
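If you'd rather not un-tar by hand, the check can be scripted with Python's standard tarfile module. This is a minimal sketch: the helper names sdist_members and missing_files are made up for illustration, and the archive built in the demo is an in-place stand-in, not real output from python setup.py sdist.

```python
import io
import os
import tarfile
import tempfile


def sdist_members(tarball_path):
    """Return the sorted member names of a .tar.gz archive."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(tar.getnames())


def missing_files(tarball_path, expected_suffixes):
    """Return the expected path suffixes that no archive member ends with."""
    names = sdist_members(tarball_path)
    return [s for s in expected_suffixes
            if not any(n.endswith(s) for n in names)]


# Demo on a stand-in archive that contains task.py but nothing from src/.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(demo_path, "w:gz") as tar:
        info = tarfile.TarInfo("my_model-0.1/my_model/task.py")
        info.size = 0
        tar.addfile(info, io.BytesIO(b""))
    demo_missing = missing_files(
        demo_path, ["my_model/task.py", "my_model/src/model.py"]
    )

print("missing from archive:", demo_missing)
```

To use it on a real build, point missing_files at the tarball that python setup.py sdist writes into dist/ and pass the files you expect, such as task.py and the modules under src.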

