AWS S3 Multipart Upload via API Gateway or Lambda

Original post: 2018-05-30 16:34:01. Tags: amazon-web-services, aws-lambda, aws-sdk, aws-api-gateway

I'm trying to create a reusable, serverless large-file upload service in AWS (we host a number of sites). What I would like to do is set up an API Gateway in AWS and use CORS to control which sites can upload, so the sites can use client-side code. Here is what I've tried and the roadblocks I've run into. Does anybody have any suggested workarounds?

Calling S3 directly from client-side upload code would require me to expose authentication information on the client side, which seems bad
API Gateway does not appear to support calling the S3 multipart API through its AWS Service integration type (the URL is fixed to the generic S3 service URL, and IAM isn't supported in the HTTP integration type)
Leveraging Lambda to call the multipart API won't work, because it can only take in 6 MB of invoke request payload, and base64-encoding a part of the 5 MB minimum upload part size (base64 adds roughly a third) pushes the data well past 6 MB
I could do my own partial upload functionality in Lambda, storing the chunks in S3, but I can't figure out how to merge them together within Lambda's memory and tmp storage space (still, PassThrough streams do not appear to work with the AWS SDK)
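
The arithmetic behind the Lambda roadblock can be checked directly. This is only a sketch: it assumes the 6 MB invoke limit and the 5 MB minimum part size are both counted in units of 1024 * 1024 bytes, and the constant and function names are made up for illustration.

```python
import base64

# Figures from the post; reading "MB" as 1024 * 1024 bytes is an assumption.
LAMBDA_PAYLOAD_LIMIT = 6 * 1024 * 1024  # invoke request payload limit, in bytes
MIN_PART_SIZE = 5 * 1024 * 1024         # smallest non-final multipart part, in bytes


def base64_size(raw_size: int) -> int:
    """Return the length of the padded base64 encoding of raw_size bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_size + 2) // 3)


encoded = base64_size(MIN_PART_SIZE)

# Cross-check the formula against the real encoder.
assert encoded == len(base64.b64encode(bytes(MIN_PART_SIZE)))

print(f"raw part: {MIN_PART_SIZE} bytes, base64: {encoded} bytes")
print("fits in one Lambda invoke:", encoded <= LAMBDA_PAYLOAD_LIMIT)
```

Under these assumptions a minimum-size part grows to about 6.99 million bytes once encoded, so it cannot fit in a single invoke request.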

Any ideas? Are any of these worth digging into? Or is serverless a no-go for this use case?

So, after further follow-up with Amazon, it's sort of possible to use pre-signed URLs with the multipart API, but it's not very practical. The steps involved would include the following:

Create a new file, and split it into parts.
Generate a presigned URL to initiate the multipart upload.
Use the presigned URL to initiate the upload.
Generate a presigned URL for each part, using a part number.
Use the URLs to send the UploadPart requests. Keep track of the ETag that is returned for each part number.
Combine all of the part numbers and corresponding ETags to form the request body.
Generate a presigned URL to complete the multipart upload.
Complete the multipart upload by sending the request to the presigned complete-multipart-upload URL.
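
The steps above can be sketched roughly as follows. This is a hedged illustration, not what Amazon provided: it assumes a boto3-style S3 client is passed in as `s3_client` (for example `boto3.client("s3")`), and the helper names, the 5 MiB default part size, and the one-hour expiry are all invented for the example. The completion body is written without an XML namespace, which is an assumption about what S3 will accept.

```python
import xml.etree.ElementTree as ET


def split_into_parts(data: bytes, part_size: int = 5 * 1024 * 1024) -> list:
    """Step 1: cut the payload into (part_number, chunk) pairs, numbered from 1."""
    return [
        (number, data[offset:offset + part_size])
        for number, offset in enumerate(range(0, len(data), part_size), start=1)
    ]


def presign_initiate(s3_client, bucket, key, expires=3600):
    """Step 2: presign the request that starts the multipart upload (a POST)."""
    return s3_client.generate_presigned_url(
        "create_multipart_upload",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )


def presign_parts_and_complete(s3_client, bucket, key, upload_id,
                               part_numbers, expires=3600):
    """Steps 4 and 7: presign one PUT URL per part, plus the completion POST URL."""
    part_urls = {
        number: s3_client.generate_presigned_url(
            "upload_part",
            Params={"Bucket": bucket, "Key": key,
                    "UploadId": upload_id, "PartNumber": number},
            ExpiresIn=expires,
        )
        for number in part_numbers
    }
    complete_url = s3_client.generate_presigned_url(
        "complete_multipart_upload",
        Params={"Bucket": bucket, "Key": key, "UploadId": upload_id},
        ExpiresIn=expires,
    )
    return part_urls, complete_url


def build_complete_body(etags: dict) -> bytes:
    """Step 6: build the completion XML from a {part_number: etag} mapping."""
    root = ET.Element("CompleteMultipartUpload")
    for number in sorted(etags):  # parts are listed in ascending order
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = etags[number]
    return ET.tostring(root)
```

In this sketch the server side only hands out URLs. The client would POST to the initiate URL and read the UploadId from the XML response, PUT each chunk to its part URL and record the ETag response header, then POST the output of `build_complete_body` to the completion URL. The number of round trips is a fair illustration of why the approach is not very practical.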

Will accept Angelo's answer, since it did point in this direction, which technically seems possible.

1 answer

You might be able to use presigned URLs for the upload. In this case the client would hit your API, which would do whatever validation is necessary and then generate a presigned URL to S3 that is returned to the client. The client then uploads directly to S3.

You can see some information here: https://sanderknape.com/2017/08/using-pre-signed-urls-upload-file-private-s3-bucket/
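
As a rough sketch of that flow, the handler behind the API might look something like the following. Everything here is an assumption made for illustration: the `ALLOWED_ORIGINS` allow-list stands in for "whatever validation is necessary", the function name and five-minute expiry are invented, and `s3_client` is expected to be a boto3-style S3 client such as `boto3.client("s3")`.

```python
# Hypothetical allow-list of sites that may request upload URLs.
ALLOWED_ORIGINS = {"https://site-a.example", "https://site-b.example"}


def issue_upload_url(s3_client, origin, bucket, key, expires=300):
    """Validate the caller, then return a short-lived presigned PUT URL."""
    # Placeholder validation; a real service would likely also authenticate
    # the caller, since an Origin header alone can be forged outside a browser.
    if origin not in ALLOWED_ORIGINS:
        raise PermissionError(f"origin not allowed: {origin}")

    # The URL is signed with the server's credentials, so the client
    # never sees any AWS keys.
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )
```

The client would then send the file body with an HTTP PUT to the returned URL. The bucket itself would also need a CORS configuration that allows PUT from the same sites for browser uploads to succeed.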