
How to handle entry points nested in folders with the Amazon SageMaker PyTorch estimator?

I am trying to run a training job on Amazon SageMaker using the estimator class from the SageMaker Python SDK.

I have the following:

from sagemaker.pytorch import PyTorch

estimator = PyTorch(entry_point='training_scripts/train_MSCOCO.py',
                    source_dir='./',
                    role=dummy_role,  # placeholder, real value hidden
                    train_instance_type='ml.p3.2xlarge',
                    train_instance_count=1,
                    framework_version='1.0.0',
                    output_path=dummy_output_path,  # placeholder, real value hidden
                    hyperparameters={'lr': 0.001,
                                     'batch_size': 32,
                                     'num_workers': 4,
                                     'description': description})

The role and output_path values are hidden for privacy.

I get the following error: "No module named training_scripts\\train_MSCOCO".

When I run python -m training_scripts.train_MSCOCO, the script runs fine. However, when I pass entry_point='training_scripts.train_MSCOCO.py', the job does not start and fails with: "No file named "training_scripts.train_MSCOCO.py" was found in directory "./"".

I am confused about how to run a nested training script from the top level of my repository on SageMaker. The two steps seem to have conflicting path requirements: one expects Python module dot notation, the other expects standard file path slash notation.

Either of these will work:

estimator = PyTorch(entry_point='training_scripts/train_MSCOCO.py',
                    role=dummy_role,  # placeholder
                    ...

estimator = PyTorch(entry_point='train_MSCOCO.py',
                    source_dir='training_scripts',
                    role=dummy_role,  # placeholder
                    ...
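
The second form can be derived mechanically from a nested script path. Below is a minimal sketch, assuming a hypothetical helper named split_entry_point (it is not part of the SageMaker SDK) that splits a nested path into the values to pass as entry_point and source_dir. Normalizing backslashes is also an assumption, included only because the error message in the question shows a Windows-style separator.

```python
import posixpath


def split_entry_point(script_path):
    """Split a nested script path into (entry_point, source_dir).

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    """
    # Assumption: convert Windows-style backslashes to forward slashes
    # so the same path string splits identically on any OS.
    normalized = script_path.replace("\\", "/")
    source_dir, entry_point = posixpath.split(normalized)
    # A script at the top level has no folder part; fall back to ".".
    return entry_point, source_dir or "."


entry_point, source_dir = split_entry_point("training_scripts/train_MSCOCO.py")
print(entry_point, source_dir)  # prints: train_MSCOCO.py training_scripts
```

The two returned values could then be passed as PyTorch(entry_point=entry_point, source_dir=source_dir, ...). This only covers how the arguments are spelled; whether the script's own imports still resolve from the new source_dir depends on the repository layout.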
