Parsing Base64 encoded data containing an image from AWS Lambda with Python

I have a Lambda function set up with a POST method that should receive an image as multipart/form-data, load the image, do some calculations, and return a simple array of numbers. The Lambda function sits behind an API Gateway with Lambda proxy integration enabled and multipart/form-data set as a Binary Media Type.

However, I can't for the life of me figure out how to parse the multipart form data that is returned from AWS Lambda.

The event['body'] contains base64 encoded data that I can't post here because it takes up too much space.

I use the following snippet of code to parse the multipart form data:

import base64

from requests_toolbelt.multipart import decoder

multipart_string = base64.b64decode(body)
content_type = data['event']['headers']['Content-Type']
multipart_data = decoder.MultipartDecoder(multipart_string, content_type)

where content_type is 'multipart/form-data; boundary=--------------------------881952313555430391739156'.

Running through the components of multipart_data like this:

for part in multipart_data.parts:
    print(part.content)
    print(part.headers)

gives the following. The content (too long to post) looks like this:

b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\ ... x00\x7f\xff\xd9'

and the headers:

{b'Content-Disposition': b'form-data; name="image"; filename="8281460-3x2-700x467.jpg"', b'Content-Type': b'image/jpeg'}

However, it is still not clear to me: a) What part of the content is the actual image? b) How can I extract the image and, e.g., get it into PIL with Image.open?


Supplementary information:

Here is the small Flask app I use to POST the image and return the event data:

import json

from flask import Flask, request 

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def hello(event, context):

    response = {
        "statusCode": 200,
        "event": event
    }

    return {
        "body": json.dumps(response),
    }

and here is the Postman request as Python code:

import requests

url = "url-to-lambda-function"

payload = "------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"image\"; filename=\"8281460-3x2-700x467.jpg\"\r\nContent-Type: image/jpeg\r\n\r\n\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW--"
headers = {
    'content-type': "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW",
    'User-Agent': "PostmanRuntime/7.18.0",
    'Accept': "*/*",
    'Cache-Control': "no-cache",
    'Content-Type': "multipart/form-data; boundary=--------------------------881952313555430391739156",
    'Accept-Encoding': "gzip, deflate",
    'Content-Length': "30417",
    'Connection': "keep-alive",
    'cache-control': "no-cache"
    }

response = requests.request("POST", url, data=payload, headers=headers)

print(response.text)

For anyone coming here, this is how I ended up solving it:

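import base64
import io

import numpy as np
from PIL import Image
from requests_toolbelt.multipart import decoder

# `event` is the API Gateway proxy event passed to the Lambda handler
body = event["body"]
content_type = event["headers"]["Content-Type"]

# The binary request body arrives base64 encoded
body_dec = base64.b64decode(body)
multipart_data = decoder.MultipartDecoder(body_dec, content_type)

# Each part's content is the raw bytes of one form field
binary_content = []
for part in multipart_data.parts:
    binary_content.append(part.content)

# The first part holds the uploaded image
imageStream = io.BytesIO(binary_content[0])
imageFile = Image.open(imageStream)
imageArray = np.array(imageFile)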

which will yield an array that you can work with. For me, the difficulty was in understanding how the multipart/form-data was stitched together again.
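To make the "stitching" more concrete, here is a rough, hand-rolled sketch of how a multipart/form-data body is laid out and split apart again. It is only an illustration under simplifying assumptions (one level of parts, CRLF line endings, the boundary never occurring inside the data); it is not how requests_toolbelt is implemented, and the names build_demo_body and split_multipart are made up for this example. The point it shows is that everything between the blank line after the part headers and the next boundary is the file itself, so part.content is the whole image (for a JPEG it starts with b'\xff\xd8' and ends with b'\xff\xd9').

```python
import base64


def build_demo_body(boundary: str, image_bytes: bytes) -> bytes:
    """Assemble a one-part multipart/form-data body (hypothetical demo helper)."""
    b = boundary.encode()
    return b"".join([
        b"--" + b + b"\r\n",
        b'Content-Disposition: form-data; name="image"; filename="demo.jpg"\r\n',
        b"Content-Type: image/jpeg\r\n",
        b"\r\n",                      # blank line: headers end, content starts
        image_bytes + b"\r\n",
        b"--" + b + b"--\r\n",        # closing delimiter
    ])


def split_multipart(body: bytes, content_type: str):
    """Return a list of (headers, content) tuples, one per part (illustrative only)."""
    boundary = content_type.split("boundary=", 1)[1].split(";", 1)[0].strip().strip('"')
    delimiter = b"--" + boundary.encode()
    parts = []
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):   # reached the closing delimiter
            break
        header_blob, _, content = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in header_blob.strip().split(b"\r\n"):
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        parts.append((headers, content[:-2]))  # drop the CRLF before the next delimiter
    return parts


# Simulate what the Lambda sees: a base64 string in event["body"]
boundary = "demo-boundary"
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"
event_body = base64.b64encode(build_demo_body(boundary, fake_jpeg)).decode()

parts = split_multipart(
    base64.b64decode(event_body),
    f"multipart/form-data; boundary={boundary}",
)
headers, content = parts[0]
print(headers[b"content-type"], content == fake_jpeg)
```

In real code I would still leave the parsing to a library such as requests_toolbelt; the sketch is only meant to show where the image bytes sit.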

The AWS documentation says that the maximum payload size for a (REST) API Gateway is 10 MB. You did not provide your image size, but if it is more than 10 MB, consider redesigning your architecture. I would suggest uploading the image to S3 instead: have your Lambda function return a pre-signed URL and let the client upload to it. After the image is uploaded to S3, you can fetch the object inside your Lambda function and do your calculations. https://docs.aws.amazon.com/AmazonS3/latest/dev/UploadObjectPreSignedURLDotNetSDK.html
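A minimal sketch of the pre-signed URL idea in Python, assuming boto3 is available in the Lambda runtime. The bucket name, object key, and the presigned_upload_response helper are placeholders invented for this example, and the expiry of 300 seconds is an arbitrary choice. The client would then PUT the image bytes to the returned URL.

```python
import json


def presigned_upload_response(s3_client, bucket: str, key: str, expires_in: int = 300) -> dict:
    """Build a proxy-style response carrying a pre-signed PUT URL (hypothetical helper)."""
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds the URL stays valid
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"upload_url": url, "key": key}),
    }


def handler(event, context):
    # boto3 is imported here so the helper above can be exercised without AWS access
    import boto3

    # "my-upload-bucket" and the key are placeholders, not values from the question
    return presigned_upload_response(
        boto3.client("s3"), "my-upload-bucket", "uploads/image.jpg"
    )
```

Passing the S3 client in as an argument is just a design choice that makes the helper easy to try out with a stand-in client.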

To add to tmo's answer: my multipart/form-data posts (to an AWS Lambda with API Gateway proxy integration) required that I access the Content-Type header with this instead:

content_type = event['multiValueHeaders']['Content-Type'][0]

and then access the parts of the form data from tmo's binary_content list with:

...
file_content = binary_content[0]
filename = binary_content[1].decode()
team_id = binary_content[2].decode()
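Since the header key and the part order can differ between clients, a slightly more defensive variant might look like the sketch below. It assumes the event carries "headers" and/or "multiValueHeaders" as in the snippets above, and that each part exposes .headers (bytes keys and values, as in the printed output in the question) and .content. The helper names get_content_type, field_name, and parts_by_name are invented for this example, and the demo parts are stand-ins rather than real decoder output.

```python
from email.message import Message
from types import SimpleNamespace


def get_content_type(event: dict) -> str:
    """Find the Content-Type header regardless of key casing or header map."""
    for name, value in (event.get("headers") or {}).items():
        if name.lower() == "content-type":
            return value
    for name, values in (event.get("multiValueHeaders") or {}).items():
        if name.lower() == "content-type" and values:
            return values[0]
    raise KeyError("Content-Type header not found in event")


def field_name(part_headers):
    """Return the form field name from a part's Content-Disposition header, or None."""
    for key, value in part_headers.items():
        if key.decode().lower() == "content-disposition":
            msg = Message()
            msg["Content-Disposition"] = value.decode()
            return msg.get_param("name", header="content-disposition")
    return None


def parts_by_name(parts) -> dict:
    """Map each form field name to its raw bytes, instead of relying on part order."""
    return {field_name(part.headers): part.content for part in parts}


# Stand-in objects shaped like the decoder's parts, for demonstration only
demo_parts = [
    SimpleNamespace(
        headers={b"Content-Disposition": b'form-data; name="image"; filename="a.jpg"',
                 b"Content-Type": b"image/jpeg"},
        content=b"\xff\xd8\xff\xd9",
    ),
    SimpleNamespace(
        headers={b"Content-Disposition": b'form-data; name="team_id"'},
        content=b"42",
    ),
]
fields = parts_by_name(demo_parts)
team_id = fields["team_id"].decode()
```

With the real decoder, parts_by_name(multipart_data.parts) would be called in place of the demo list.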

