Azure ML Experiment Batch Webservice Call Fails with Invalid Output Extension

I have an Azure WebJob that calls an ML training experiment via HTTP requests, using the code generated in the ML web portal:

var request = new BatchExecutionRequest()
{
    Inputs = new Dictionary<string, AzureBlobDataReference>() {
        {
            "input1",
            new AzureBlobDataReference()
            {
                ConnectionString = _connectionString,
                RelativeLocation = $"{_containerName}/{experimentId}/{tenantId}/{trainingDataFileName}"
            }
        },
    },

    Outputs = new Dictionary<string, AzureBlobDataReference>() {
        {
            "output1",
            new AzureBlobDataReference()
            {
                ConnectionString = "azureStorageConnectionString",
                RelativeLocation = $"{_containerName}/{experimentId}/{tenantId}/Model_2018421.ilearner"
            }
        },
    },

    GlobalParameters = new Dictionary<string, string>()
    {
    }
};

However, the request fails with the following message:

The blob reference: experiments/experimentId/TenantId/Model_2018421.ilearner has an invalid or missing file extension. Supported file extensions for this output type are: ".csv, .tsv, .arff"

I'm pretty confused by this, since the documentation says all over the place that if I'm expecting a trained model, I should use ".ilearner" as the file extension for the model.

I've seen this question asking about the same error when using Data Factory, and also this question on datascience.stackexchange. Neither one had any clues, answers, or other follow-up.

Any insight on what I'm missing would be greatly appreciated!

For anyone looking for their "Don't Overthink It" moment of the day:

I needed to provide TWO output blob file references:

var request = new BatchExecutionRequest()
{
    Inputs = new Dictionary<string, AzureBlobDataReference>() {
        {
            "input1",
            new AzureBlobDataReference()
            {
                ConnectionString = _connectionString,
                RelativeLocation = $"{_containerName}/{experimentId}/{tenantId}/{trainingDataFileName}.csv"
            }
        },
    },

    Outputs = new Dictionary<string, AzureBlobDataReference>() {
        {
            "output1",
            new AzureBlobDataReference()
            {
                ConnectionString = _connectionString,
                RelativeLocation = $"{_containerName}/{experimentId}/{tenantId}/{outputFileNameCsv}.csv"
            }
        },
        {
            "output2",
            new AzureBlobDataReference()
            {
                ConnectionString = _connectionString,
                RelativeLocation = $"{_containerName}/{experimentId}/{tenantId}/{outputFileNameIlearner}.ilearner"
            }
        },
    },

    GlobalParameters = new Dictionary<string, string>()
    {
    }
};

There's an old saying in American English about not making assumptions, and I assumed the second output was an optional parameter used in batch operations. Since I'm not actually looking for more than one result from each call, I thought it was safe to remove the second output parameter.
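One way to make it harder to drop an output by accident is to build both references in one place. The following is only a sketch: `TrainingOutputs.Build`, its parameters, and the `blobFolder`/`baseName` split are hypothetical names for illustration, and `AzureBlobDataReference` here is a minimal stand-in for the class of the same name in the portal-generated sample, reduced to the two properties used in this post. The output names `output1` and `output2` match this particular experiment; yours may differ.

```csharp
using System.Collections.Generic;

// Minimal stand-in for the portal-generated class of the same name,
// reduced to the two properties used in this post.
public class AzureBlobDataReference
{
    public string ConnectionString { get; set; }
    public string RelativeLocation { get; set; }
}

// Hypothetical helper: the name and parameters are illustrative only.
public static class TrainingOutputs
{
    // Builds both output references so neither can be omitted by accident.
    public static Dictionary<string, AzureBlobDataReference> Build(
        string connectionString, string blobFolder, string baseName)
    {
        return new Dictionary<string, AzureBlobDataReference>
        {
            {
                // First output: the error message shows this output type
                // only accepts .csv, .tsv, or .arff.
                "output1",
                new AzureBlobDataReference
                {
                    ConnectionString = connectionString,
                    RelativeLocation = $"{blobFolder}/{baseName}.csv"
                }
            },
            {
                // Second output: the trained model itself.
                "output2",
                new AzureBlobDataReference
                {
                    ConnectionString = connectionString,
                    RelativeLocation = $"{blobFolder}/{baseName}.ilearner"
                }
            },
        };
    }
}
```

In the request above this could be used as `Outputs = TrainingOutputs.Build(_connectionString, $"{_containerName}/{experimentId}/{tenantId}", "Model_2018421")`.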

TL;DR: Keep all the parameters that the web service portal's "Consume" tab generates, and make sure the first one is a .csv file reference.
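If you want to catch this before the service does, a small pre-flight check can compare the outputs you are about to send against what the "Consume" tab lists. This is a sketch under assumptions, not part of the generated sample: `BatchOutputCheck` and `FindProblems` are hypothetical names, and the expected output names and extensions must come from your own web service (for the experiment in this post that was `output1` with .csv/.tsv/.arff and `output2` with .ilearner).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the portal-generated sample code.
public static class BatchOutputCheck
{
    // expected: output name -> allowed extensions (from your service's "Consume" tab).
    // actual:   output name -> RelativeLocation you are about to send.
    // Returns human-readable problems; an empty list means nothing was found.
    public static List<string> FindProblems(
        IDictionary<string, string[]> expected,
        IDictionary<string, string> actual)
    {
        var problems = new List<string>();

        foreach (var pair in expected)
        {
            if (!actual.TryGetValue(pair.Key, out var location))
            {
                problems.Add($"Missing output '{pair.Key}'.");
                continue;
            }

            // Path.GetExtension returns the extension with its leading dot,
            // or an empty string when there is none.
            var extension = Path.GetExtension(location);
            if (!pair.Value.Contains(extension, StringComparer.OrdinalIgnoreCase))
            {
                problems.Add(
                    $"Output '{pair.Key}' has extension '{extension}', " +
                    $"expected one of: {string.Join(", ", pair.Value)}.");
            }
        }

        return problems;
    }
}
```

Run against the failing request from the question (a single `output1` pointing at an .ilearner blob), this reports two problems: the wrong extension on `output1` and the missing `output2`.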
