
AWS S3 file uploading in Java: why does uploading an InputStream directly consume more memory than writing it to a temporary file and then uploading that file?

I want to upload a file to an S3 bucket using the Streaming API of the commons-fileupload library. This is the code that parses the request:

FileItemIterator iterStream = upload.getItemIterator(request);
while (iterStream.hasNext()) {
    FileItemStream item = iterStream.next();
    String name = item.getFieldName();
    InputStream stream = item.openStream();
    if (!item.isFormField()) {
        // Process the InputStream stream (*)
    } else {
        String formFieldValue = Streams.asString(stream);
    }
}

This code initializes the S3 client and the transfer manager:

s3Client = AmazonS3ClientBuilder.standard()
                .withRegion(Regions.DEFAULT_REGION)
                .withCredentials(new ProfileCredentialsProvider())
                .build();
transferManager = TransferManagerBuilder.standard()
                .withS3Client(s3Client)
                .build();

I tested with a 100MB file. At the beginning, my Spring Boot app started with about 95MB of RAM usage. Uploading that stream (*) to the S3 bucket using

        Upload upload = transferManager.upload(bucketName, key, inputStream, metadata );

consumes significantly more memory (from 90MB to 370MB) than copying the stream (*) to an OutputStream and then uploading the file created from that OutputStream (from 90MB to 100MB):

try (
      OutputStream out = new FileOutputStream(fileName);
) {
      IOUtils.copy(inputStream, out);
}
PutObjectRequest request = new PutObjectRequest(
                existingBucketName, fileName, new File(fileName));
Upload upload = transferManager.upload(request);

I wonder why that is. What happens to the InputStream that makes uploading it directly consume so much more memory? Thank you very much.

If you don't know the size of the file, just use the file upload. Uploading from an InputStream without a known size should be avoided. The TransferManager Javadoc says:

When uploading options from a stream, callers must supply the size of options in the stream through the content length field in the ObjectMetadata parameter. If no content length is specified for the input stream, then TransferManager will attempt to buffer all the stream contents in memory and upload the options as a traditional, single part upload. Because the entire stream contents must be buffered in memory, this can be very expensive, and should be avoided whenever possible.

https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#upload-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
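
As a rough illustration of the file-based route, here is a minimal sketch that uses only the JDK to spool a stream of unknown length to a temporary file while counting its bytes. The class name `StreamSpooler`, the holder type `SpooledFile`, and the method `spoolToTempFile` are hypothetical names chosen for this example; they are not part of commons-fileupload or the AWS SDK. `Files.copy` moves the data through a small buffer, so heap usage should stay roughly flat regardless of the upload size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: not part of commons-fileupload or the AWS SDK.
final class StreamSpooler {

    /** Where the bytes were written and how many there were. */
    public static final class SpooledFile {
        public final Path path;
        public final long size;

        SpooledFile(Path path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    /**
     * Copies the stream to a new temporary file and returns the file
     * together with the number of bytes copied. The caller is responsible
     * for closing the stream and for deleting the file once it is no
     * longer needed.
     */
    public static SpooledFile spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".tmp");
        try {
            // createTempFile already created the file, so REPLACE_EXISTING is required.
            long size = Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return new SpooledFile(tmp, size);
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
    }
}
```

With the file on disk, `spooled.path.toFile()` can be passed to a `PutObjectRequest` as in the question. Alternatively, since the size is now known, `spooled.size` could be set through `ObjectMetadata.setContentLength` before uploading from a fresh stream over that file. Either way, the temporary file should be deleted only after the upload has completed.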
