
How to pass parameters to a training script in Azure Machine Learning service?

I am trying to submit an experiment to Azure Machine Learning service, running locally on an Azure VM, using a ScriptRunConfig object in my workspace ws, as follows:

from azureml.core import ScriptRunConfig    
from azureml.core.runconfig import RunConfiguration
from azureml.core import Experiment

experiment = Experiment(ws, name='test')
run_local = RunConfiguration()

script_params = {
    '--data-folder': './data',
    '--training-data': 'train.csv'
}

src = ScriptRunConfig(source_directory = './source_dir', 
                      script = 'train.py', 
                      run_config = run_local, 
                      arguments = script_params)

run = experiment.submit(src)

However, this fails with:

ExperimentExecutionException: { "error_details": { "correlation": { "operation": "bb12f5b8bd78084b9b34f088a1d77224", "request": "iGfp+sjC34Q=" }, "error": { "code": "UserError", "message": "Failed to deserialize run definition"

Worse, if I point the data folder at a datastore (which I will likely need to do):

script_params = {
    '--data-folder': ds.path('mydatastoredir').as_mount(),
    '--training-data': 'train.csv'
}

the error is:

UserErrorException: Dictionary with non-native python type values are not supported in runconfigs.
{'--data-folder': $AZUREML_DATAREFERENCE_d93269a580ec4ecf97be428cd2fe79, '--training-data': 'train.csv'}

I don't quite understand how I should pass my script_params parameters to my train.py (unfortunately, the documentation of ScriptRunConfig doesn't include many details on this).

Does anybody know how to properly create src in these two cases?

According to https://docs.microsoft.com/nb-no/python/api/azureml-core/azureml.core.runconfiguration?view=azure-ml-py , arguments must be passed to ScriptRunConfig and RunConfig as a list of strings, not as a dictionary. Each flag and its value become separate, consecutive list items.

The modified, working code is as follows.

from azureml.core import ScriptRunConfig    
from azureml.core.runconfig import RunConfiguration
from azureml.core import Experiment

experiment = Experiment(ws, name='test')
run_local = RunConfiguration()

script_params = [
    '--data-folder',
    './data',
    '--training-data',
    'train.csv'
]

src = ScriptRunConfig(source_directory = './source_dir', 
                      script = 'train.py', 
                      run_config = run_local, 
                      arguments = script_params)

run = experiment.submit(src)
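To illustrate the list form, here is a minimal sketch of how an existing dictionary could be flattened into such a list, and how train.py might read the values back with argparse. The helper names to_argument_list and parse_training_args are hypothetical and not part of the Azure ML SDK, and the default values are assumptions. The sketch only covers plain string or numeric values; it does not address the datastore-mount case from the question.

```python
import argparse


def to_argument_list(params):
    """Flatten a {flag: value} dict into ['flag', 'value', ...].

    Hypothetical helper, not part of the Azure ML SDK. A value of None
    is treated as a bare flag with no value following it.
    """
    arguments = []
    for flag, value in params.items():
        arguments.append(flag)
        if value is not None:
            arguments.append(str(value))
    return arguments


def parse_training_args(argv=None):
    """Parse the flags that train.py is assumed to accept.

    With argv=None, argparse reads sys.argv[1:], which is what happens
    when the script is launched with the submitted arguments.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--data-folder', type=str, default='./data')
    parser.add_argument('--training-data', type=str, default='train.csv')
    return parser.parse_args(argv)


if __name__ == '__main__':
    script_params = to_argument_list({
        '--data-folder': './data',
        '--training-data': 'train.csv',
    })
    args = parse_training_args(script_params)
    # argparse turns '--data-folder' into the attribute 'data_folder'.
    print(args.data_folder, args.training_data)
```

The flattened list can be passed as arguments to ScriptRunConfig in place of the hand-written one, and the parsing function would live in train.py itself.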

In the end I abandoned ScriptRunConfig and used Estimator to pass script_params, as follows (after having provisioned a compute target):

from azureml.train.estimator import Estimator

# Unlike ScriptRunConfig's arguments, Estimator takes script_params as a dict.
estimator = Estimator(source_directory='./mysourcedir',
                      script_params=script_params,
                      compute_target='cluster',
                      entry_script='train.py',
                      conda_packages=["pandas"],
                      pip_packages=["git+https://github.com/..."],
                      use_docker=True,
                      custom_docker_image='<mydockeraccount>/<mydockerimage>')

This also let me install my pip_packages dependency by publishing a custom Docker image (the custom_docker_image above) on https://hub.docker.com/, built from a Dockerfile like:

FROM continuumio/miniconda
RUN apt-get update
RUN apt-get install git gcc g++ -y

(it worked!)
