
AWS S3 file upload in Java: why does uploading an InputStream directly consume more memory than writing it to a temporary file first and then uploading?

I want to upload a file to an S3 bucket using the streaming API of the commons-fileupload library. This code parses the request:

FileItemIterator iterStream = upload.getItemIterator(request);
while (iterStream.hasNext()) {
    FileItemStream item = iterStream.next();
    String name = item.getFieldName();
    InputStream stream = item.openStream();
    if (!item.isFormField()) {
        // Process the InputStream stream (*)
    } else {
        String formFieldValue = Streams.asString(stream);
    }
}

This code initializes the S3 client and the transfer manager:

s3Client = AmazonS3ClientBuilder.standard()
                .withRegion(Regions.DEFAULT_REGION)
                .withCredentials(new ProfileCredentialsProvider())
                .build();
transferManager = TransferManagerBuilder.standard()
                .withS3Client(s3Client)
                .build();

I tested with a 100 MB file. At startup, my Spring Boot app used about 95 MB of RAM. Uploading that stream (*) directly to the S3 bucket using

        Upload upload = transferManager.upload(bucketName, key, inputStream, metadata );

consumes significantly more memory (from 90 MB to 370 MB) than copying the stream (*) to an OutputStream and then uploading the file created from that OutputStream (from 90 MB to 100 MB):

try (
      OutputStream out = new FileOutputStream(fileName);
) {
      IOUtils.copy(inputStream, out);
}
PutObjectRequest request = new PutObjectRequest(
                existingBucketName, fileName, new File(fileName));
Upload upload = transferManager.upload(request);

I wonder why that is. What happens to the InputStream that makes a direct upload consume so much more memory? Thank you very much.

If you don't know the size of the file, just upload it as a file. Uploading from an InputStream without a known content length should be avoided. The TransferManager Javadoc says:

When uploading options from a stream, callers must supply the size of options in the stream through the content length field in the ObjectMetadata parameter. If no content length is specified for the input stream, then TransferManager will attempt to buffer all the stream contents in memory and upload the options as a traditional, single part upload. Because the entire stream contents must be buffered in memory, this can be very expensive, and should be avoided whenever possible.

https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#upload-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
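As a minimal sketch of the file-based approach, the stream can be spooled to a temporary file using only the JDK, which also gives you its exact size. The class name `StreamSpooler` and the method `spoolToTempFile` are hypothetical names chosen for illustration; they are not part of commons-fileupload or the AWS SDK. The AWS calls appear only in comments, so the block compiles without the SDK on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
final class StreamSpooler {

    /**
     * Copies the whole stream to a new temporary file and returns its path.
     * Files.copy streams the data through a small internal buffer, so the
     * full content is not held in memory at once.
     */
    static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".tmp");
        // createTempFile already created the file, so it must be replaced.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by item.openStream().
        InputStream in = new ByteArrayInputStream(new byte[4096]);
        Path tmp = spoolToTempFile(in);
        try {
            long size = Files.size(tmp);
            System.out.println("spooled " + size + " bytes to " + tmp);
            // With the AWS SDK for Java v1 available, either of these
            // avoids buffering an unknown-length stream in memory:
            //   transferManager.upload(bucketName, key, tmp.toFile());
            // or, if you still want the stream overload:
            //   metadata.setContentLength(size);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If the size is already known before reading the stream, setting it with `metadata.setContentLength(...)` should be enough and the temporary file can be skipped. Remember to delete the temporary file after the upload completes.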


 