Uppy Companion doesn't work for > 5GB files with Multipart S3 uploads

Our app allows our clients to upload large files. Files are stored on AWS/S3, and we use Uppy for the upload; it is dockerized to run under a Kubernetes deployment where we can scale up the number of instances.

It works well, but we noticed that all > 5GB uploads fail. I know Uppy has a plugin for AWS multipart uploads, but even when it is installed during the container image creation, the result is the same.

Here's our Dockerfile. Has anyone ever succeeded in uploading > 5GB files to S3 via Uppy? Is there anything we're missing?

FROM node:alpine AS companion
RUN yarn global add @uppy/companion@3.0.1
RUN yarn global add @uppy/aws-s3-multipart
ARG UPPY_COMPANION_DOMAIN=[...redacted..]
ARG UPPY_AWS_BUCKET=[...redacted..]


ENV COMPANION_SECRET=[...redacted..]
ENV COMPANION_PREAUTH_SECRET=[...redacted..]
ENV COMPANION_DOMAIN=${UPPY_COMPANION_DOMAIN}
ENV COMPANION_PROTOCOL="https"
ENV COMPANION_DATADIR="COMPANION_DATA"
# ENV COMPANION_HIDE_WELCOME="true"
# ENV COMPANION_HIDE_METRICS="true"
ENV COMPANION_CLIENT_ORIGINS=[...redacted..]
ENV COMPANION_AWS_KEY=[...redacted..]
ENV COMPANION_AWS_SECRET=[...redacted..]
ENV COMPANION_AWS_BUCKET=${UPPY_AWS_BUCKET}
ENV COMPANION_AWS_REGION="us-east-2"
ENV COMPANION_AWS_USE_ACCELERATE_ENDPOINT="true"
ENV COMPANION_AWS_EXPIRES="3600"
ENV COMPANION_AWS_ACL="public-read"
# We don't need to store data for just S3 uploads, but Uppy throws unless this dir exists.
RUN mkdir COMPANION_DATA

CMD ["companion"]

EXPOSE 3020

EDIT:

I made sure I had:

uppy.use(AwsS3Multipart, {
  limit: 5,
  companionUrl: '<our uppy url>',
})

And it still doesn't work. I see all the chunks of the 9GB file sent on the network tab, but as soon as it hits 100%, Uppy throws a "cannot post" error (to our S3 URL) and that's it: failure.

Has anyone ever encountered this? The upload goes fine until 100%, then the last chunk gets an HTTP error 413, making the entire upload fail.


Thanks!

In the AWS S3 service, a single PUT operation can upload a single object of at most 5 GB in size.

To upload > 5GB files to S3 you need to use the S3 multipart upload API, and also the AwsS3Multipart Uppy API.

Check your upload code to confirm you are using AwsS3Multipart correctly, for example by setting the limit option (the maximum number of chunks uploaded simultaneously) properly; in this case a limit between 5 and 15 is recommended.

import AwsS3Multipart from '@uppy/aws-s3-multipart'

uppy.use(AwsS3Multipart, {
  limit: 5,
  companionUrl: 'https://uppy-companion.myapp.net/',
})
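For context, Companion signs the requests that drive S3's multipart API on the client's behalf. A rough sketch of that underlying flow with @aws-sdk/client-s3 (the bucket, key, and region are placeholders) can make the chunked requests you see in the network tab easier to interpret:

import {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
} from '@aws-sdk/client-s3'

const s3 = new S3Client({ region: 'us-east-2' }) // placeholder region

// parts: Buffers of at least 5 MB each (only the last part may be smaller)
async function multipartUpload(bucket: string, key: string, parts: Buffer[]) {
  const { UploadId } = await s3.send(
    new CreateMultipartUploadCommand({ Bucket: bucket, Key: key }),
  )
  const uploaded: { ETag?: string; PartNumber: number }[] = []
  for (let i = 0; i < parts.length; i += 1) {
    const { ETag } = await s3.send(new UploadPartCommand({
      Bucket: bucket, Key: key, UploadId,
      PartNumber: i + 1, // part numbers are 1-based
      Body: parts[i],
    }))
    uploaded.push({ ETag, PartNumber: i + 1 })
  }
  // The completion step posts the full list of part ETags in a single
  // request; for very large files this payload itself can be sizeable.
  return s3.send(new CompleteMultipartUploadCommand({
    Bucket: bucket, Key: key, UploadId,
    MultipartUpload: { Parts: uploaded },
  }))
}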

Also, check this issue on GitHub: Uploading a large >5GB file to S3 errors out #1945

Here I'm adding some code samples from my repository that will help you understand the flow of using the busboy package to stream the data to an S3 bucket. I'm also adding the reference links here so you can get the details of the packages I'm using.

https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-s3/index.html

https://www.npmjs.com/package/busboy

import { Request, Response } from 'express';
import Busboy from 'busboy'; // busboy < 1.0; newer versions export a factory function instead
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Response shape inferred from how it is used below
interface ResponseDto {
    status: boolean;
    message: string;
    data?: any;
    error?: string;
}

export const uploadStreamFile = async (req: Request, res: Response) => {
    const busboy = new Busboy({ headers: req.headers });
    const streamResponse = await busboyStream(busboy, req);
    const uploadResponse = await s3FileUpload(streamResponse.data.buffer);
    return res.send(uploadResponse);
};

const busboyStream = async (busboy: any, req: Request): Promise<any> => {
    return new Promise((resolve, reject) => {
        try {
            const fileData: any[] = [];
            let fileBuffer: Buffer;
            let metaData: any; // captured from the file event for the response below
            busboy.on('file', (fieldName: any, file: any, fileName: any, encoding: any, mimetype: any) => {
                // ! File is missing in the request
                if (!fileName) {
                    return reject("File not found!");
                }
                metaData = { fieldName, fileName, encoding, mimetype };

                let totalBytes: number = 0;
                file.on('data', (chunk: any) => {
                    fileData.push(chunk);
                    // ! given code is only for logging purposes
                    // TODO will remove once project is live
                    totalBytes += chunk.length;
                    console.log('File [' + fieldName + '] got ' + chunk.length + ' bytes');
                });

                file.on('error', (err: any) => {
                    reject(err);
                });

                file.on('end', () => {
                    fileBuffer = Buffer.concat(fileData);
                });
            });

            // ? Finally, file parsing went well
            busboy.on('finish', () => {
                const responseData: ResponseDto = {
                    status: true, message: "File parsing done", data: {
                        buffer: fileBuffer,
                        metaData
                    }
                };
                resolve(responseData);
                console.log('Done parsing data! -> File uploaded');
            });
            req.pipe(busboy);
        } catch (error) {
            reject(error);
        }
    });
};

// Shared S3 client; the region should come from your own configuration
const s3 = new S3Client({ region: "us-east-2" });

const s3FileUpload = async (fileData: any): Promise<ResponseDto> => {
    try {
        const params: any = {
            Bucket: <BUCKET_NAME>,
            Key: <path>,
            Body: fileData,
            ContentType: <content_type>,
            ServerSideEncryption: "AES256",
        };
        const command = new PutObjectCommand(params);
        const uploadResponse: any = await s3.send(command);
        return { status: true, message: "File uploaded successfully", data: uploadResponse };
    } catch (error: any) {
        return { status: false, message: "Monitor connection failed, please contact tech support!", error: error.message };
    }
};
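One caveat with this sample: PutObjectCommand is a single PUT, so it is itself subject to the 5 GB single-object limit, and buffering the whole file in memory does not scale to files of this size. For objects larger than 5 GB, the SDK's multipart helper from @aws-sdk/lib-storage can be used instead; a minimal sketch, where the bucket, key, and region are placeholders:

import { S3Client } from '@aws-sdk/client-s3';
import { Upload } from '@aws-sdk/lib-storage';

// Upload splits the body into parts and runs the S3 multipart flow for you.
// Passing a stream instead of a Buffer also avoids holding 5+ GB in memory.
const s3MultipartUpload = async (body: NodeJS.ReadableStream | Buffer) => {
    const upload = new Upload({
        client: new S3Client({ region: 'us-east-2' }), // placeholder region
        params: {
            Bucket: 'my-bucket',        // placeholder
            Key: 'uploads/large-file',  // placeholder
            Body: body,
        },
        partSize: 100 * 1024 * 1024, // 100 MB parts; S3 requires >= 5 MB per part
        queueSize: 4,                // number of parts uploaded in parallel
    });
    return upload.done();
};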

If you're getting Error: request entity too large in your Companion server logs, I fixed this in my Companion express server by increasing the body-parser limit:

app.use(bodyparser.json({ limit: '21GB', type: 'application/json' }))
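For context, here is roughly where that line sits in a custom Companion express server. This is a minimal sketch assuming @uppy/companion 3.x (where companion.app() returns { app, emitter }); the secret, domain, and S3 settings are placeholders:

import express from 'express';
import bodyparser from 'body-parser';
import session from 'express-session';
import companion from '@uppy/companion';

const app = express();

// Raise the JSON body limit *before* mounting Companion, so the multipart
// completion request (which carries every part's ETag) is not rejected with
// HTTP 413 by body-parser's default 100kb limit.
app.use(bodyparser.json({ limit: '21GB', type: 'application/json' }));
app.use(session({ secret: 'some-secret', resave: false, saveUninitialized: false })); // placeholder secret

const { app: companionApp } = companion.app({
    secret: 'some-secret', // placeholder
    server: { host: 'uppy-companion.myapp.net', protocol: 'https' }, // placeholder domain
    filePath: './COMPANION_DATA',
    providerOptions: {
        s3: { // moved to a top-level `s3` option in newer Companion releases
            key: process.env.COMPANION_AWS_KEY,
            secret: process.env.COMPANION_AWS_SECRET,
            bucket: process.env.COMPANION_AWS_BUCKET,
            region: 'us-east-2',
        },
    },
});

app.use(companionApp);
app.listen(3020);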

This is a good working example of Uppy S3 MultiPart uploads (without this limit increased): https://github.com/jhanitesh10/uppy

I'm able to upload files up to a (self-imposed) limit of 20GB using this code.
