
Invoke an AWS SageMaker endpoint

I have some data in S3 and I want to create a Lambda function that gets predictions from my deployed AWS SageMaker endpoint and then writes the outputs back to S3. Is it necessary in this case to create an API Gateway, as described in this link? And what do I have to put in the Lambda function? I expect it to cover where to find the data, how to invoke the endpoint, and where to put the data.

import boto3

BUCKET = 'demo-scikit-byo-iris'  # substitute your S3 bucket name

s3 = boto3.client('s3')  # low-level functional API

# Read the input CSV from S3 and split it into rows
obj = s3.get_object(Bucket=BUCKET, Key='foo.csv')
lines = obj['Body'].read().decode('utf-8').splitlines()

# Re-join the rows into one CSV string; the endpoint expects a str or bytes body
payload = '\n'.join(lines)

runtime = boto3.client('sagemaker-runtime')

response = runtime.invoke_endpoint(
    EndpointName='nilm2',
    Body=payload,
    ContentType='text/csv',  # the input is CSV; adjust if your container expects another type
    Accept='text/csv')

# The response body is a stream; read and decode it into a string
output = response['Body'].read().decode('utf-8')

My data is a CSV file with 2 columns of floats and no header. The problem is that lines is a list of strings, where each row is one element of the list: ['11.55,65.23', '55.68,69.56'...]. The invoke works well, but the response is also a string: output = '65.23\n,65.23\n,22.56\n,...'

So how can I save this output to S3 as a CSV file?

Thanks

If your Lambda function runs on a schedule, you won't need an API Gateway. But if the prediction will be triggered by a user or by an application, for example, you will need one.

When you call invoke_endpoint, you are actually calling a SageMaker endpoint, which is not the same as an API Gateway endpoint.

A common architecture with SageMaker is:

  1. An API Gateway, which receives a request, calls an authorizer, and then invokes your Lambda;
  2. A Lambda, which does some parsing of your input data, calls your SageMaker prediction endpoint, then handles the result and returns it to your application.
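
As a rough illustration of step 2, here is a minimal sketch of such a Lambda handler. It assumes an API Gateway proxy integration (the request body arrives as a string in event['body']) and a hypothetical JSON request schema with a 'rows' key; neither comes from the question. The runtime client is passed in so the handler can be tried locally with a stub; in Lambda you would pass boto3.client('sagemaker-runtime').

```python
import json


def make_handler(runtime_client, endpoint_name):
    """Build a Lambda handler bound to a SageMaker runtime client.

    runtime_client is expected to be boto3.client('sagemaker-runtime').
    """
    def handler(event, context):
        # Assumption: API Gateway proxy integration, body is a JSON string
        # shaped like {"rows": [[11.55, 65.23], ...]} (hypothetical schema).
        rows = json.loads(event['body'])['rows']

        # Serialize the rows as headerless CSV, one row per line.
        payload = '\n'.join(','.join(str(v) for v in row) for row in rows)

        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            Body=payload,
            ContentType='text/csv',
            Accept='text/csv',
        )

        # Assumption: the model returns one prediction per line.
        predictions = response['Body'].read().decode('utf-8').split()

        return {
            'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'predictions': predictions}),
        }

    return handler
```

In a deployed function you would typically create the handler once at module level, for example lambda_handler = make_handler(boto3.client('sagemaker-runtime'), 'nilm2'), so the client is reused across invocations.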

From the situation you describe, I can't tell whether your task is academic or for production.

So, how can you save the data as a CSV file from your Lambda?

I believe you can just parse the output and then upload the file to S3. You can parse it manually or with a library, and boto3 can upload the file. The output of your model depends on your implementation in the SageMaker image. So, if you need the response data in another format, you may need to use a custom image. I normally use a custom image, which lets me define how I want to handle the data on requests/responses.
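
One possible way to do the parse-and-upload step is sketched below. It assumes the response looks like the string in the question (floats separated by newlines and/or commas) and that one prediction per CSV row is what you want; the helper names and the bucket/key arguments are hypothetical. The S3 client is passed in as a parameter; in Lambda it would be boto3.client('s3').

```python
import csv
import io


def parse_predictions(output):
    """Turn the endpoint's raw text response into a list of floats.

    Assumes values are separated by newlines and/or commas, as in the
    output string shown in the question.
    """
    tokens = output.replace(',', '\n').split()
    return [float(token) for token in tokens]


def predictions_to_csv(predictions):
    """Serialize the predictions as CSV text, one value per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    for value in predictions:
        writer.writerow([value])
    return buffer.getvalue()


def save_predictions(s3_client, bucket, key, output):
    """Parse the endpoint output and upload it to S3 as a CSV object.

    s3_client is expected to be boto3.client('s3').
    """
    body = predictions_to_csv(parse_predictions(output))
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body.encode('utf-8'),
        ContentType='text/csv',
    )
    return body
```

With the variables from the question, the call could look like save_predictions(boto3.client('s3'), 'demo-scikit-byo-iris', 'predictions.csv', output), where 'predictions.csv' is just a placeholder key.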

For a production task, I certainly recommend you check out Batch Transform jobs from SageMaker. You provide an input file (an S3 path) and a destination (another S3 path). SageMaker runs the batch predictions and persists a file with the results. Also, you won't need to deploy your model to an endpoint: when the job runs, it creates an instance, downloads the data to predict, runs the predictions, uploads the output, and shuts down the instance. You only need a trained model.
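
For reference, starting such a job from boto3 might look roughly like the sketch below. The job name, model name, S3 paths, and the ml.m5.large instance type are all placeholders I chose for illustration; the model must already exist in SageMaker. The client is passed in as a parameter and is expected to be boto3.client('sagemaker').

```python
def build_transform_job_request(job_name, model_name, input_s3_uri,
                                output_s3_uri, instance_type='ml.m5.large'):
    """Assemble the arguments for SageMaker's create_transform_job call."""
    return {
        'TransformJobName': job_name,
        'ModelName': model_name,
        'TransformInput': {
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'S3Prefix',
                    'S3Uri': input_s3_uri,
                },
            },
            'ContentType': 'text/csv',
            'SplitType': 'Line',  # treat each CSV line as one record
        },
        'TransformOutput': {
            'S3OutputPath': output_s3_uri,
        },
        'TransformResources': {
            'InstanceType': instance_type,  # placeholder; pick what your model needs
            'InstanceCount': 1,
        },
    }


def start_batch_transform(sagemaker_client, **kwargs):
    """Start a batch transform job and return the request that was sent.

    sagemaker_client is expected to be boto3.client('sagemaker').
    """
    request = build_transform_job_request(**kwargs)
    sagemaker_client.create_transform_job(**request)
    return request
```

The results are written under the output S3 path once the job finishes, so no Lambda code is needed to collect the predictions.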

Here is some info about Batch Transform jobs:

https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html

https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-batch-transform.html

I hope it helps; let me know if you need more info.

Regards.

