
How to get the progress status of the file uploaded to Amazon S3 using Java

I'm uploading multiple files to Amazon S3 using Java.

The code I'm using is as follows:

MultipartHttpServletRequest multipartRequest = (MultipartHttpServletRequest) request;
MultiValueMap<String, MultipartFile> map = multipartRequest.getMultiFileMap();
try {
    if (map != null) {
        for (String filename : map.keySet()) {
            List<MultipartFile> fileList = map.get(filename);
            incrPercentge = 100 / fileList.size();
            request.getSession().setAttribute("incrPercentge", incrPercentge);
            for (MultipartFile mpf : fileList) {

                /*
                 * Custom input stream wrapped around the original input stream
                 * to track the upload progress.
                 */
                ProgressInputStream inputStream = new ProgressInputStream("test",
                        mpf.getInputStream(), mpf.getBytes().length);
                ObjectMetadata metadata = new ObjectMetadata();
                metadata.setContentType(mpf.getContentType());
                String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
                PutObjectRequest putObjectRequest = new PutObjectRequest(
                        Constants.S3_BUCKET_NAME, key, inputStream, metadata)
                        .withStorageClass(StorageClass.ReducedRedundancy);
                PutObjectResult response = s3Client.putObject(putObjectRequest);
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}

I had to create a custom input stream to get the number of bytes consumed by Amazon S3. I got that idea from the question here: Upload file or InputStream to S3 with a progress callback

My ProgressInputStream class code is as follows:

package com.spectralnetworks.net.util;

import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.vfs.FileContent;
import org.apache.commons.vfs.FileSystemException;

public class ProgressInputStream extends InputStream {
    private final long size;
    private long progress, lastUpdate = 0;
    private final InputStream inputStream;
    private final String name;
    private boolean closed = false;

    public ProgressInputStream(String name, InputStream inputStream, long size) {
        this.size = size;
        this.inputStream = inputStream;
        this.name = name;
    }

    public ProgressInputStream(String name, FileContent content)
            throws FileSystemException {
        this.size = content.getSize();
        this.name = name;
        this.inputStream = content.getInputStream();
    }

    @Override
    public void close() throws IOException {
        if (closed) throw new IOException("already closed");
        inputStream.close();
        closed = true;
    }

    @Override
    public int read() throws IOException {
        // read() returns the byte value (0-255) or -1, so count one byte of progress.
        int value = inputStream.read();
        if (value != -1) progress++;
        lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
        return value;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int count = inputStream.read(b, off, len);
        if (count > 0) progress += count;
        lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
        return count;
    }

    /**
     * This is research code to show a progress bar.
     * @param name       label for the stream being read
     * @param progress   number of bytes read so far
     * @param lastUpdate progress value at the last display update
     * @param size       total size of the stream in bytes
     * @return the progress value at the last display update
     */
    static long maybeUpdateDisplay(String name, long progress, long lastUpdate, long size) {
        /* if (Config.isInUnitTests()) return lastUpdate;
        if (size < B_IN_MB / 10) return lastUpdate;
        if (progress - lastUpdate > 1024 * 10) {
            lastUpdate = progress;
            int hashes = (int) (((double) progress / (double) size) * 40);
            if (hashes > 40) hashes = 40;
            String bar = StringUtils.repeat("#", hashes);
            bar = StringUtils.rightPad(bar, 40);
            System.out.format("%s [%s] %.2fMB/%.2fMB\r",
                    name, bar, progress / B_IN_MB, size / B_IN_MB);
            System.out.flush();
        } */
        System.out.println("name " + name + "  progress " + progress
                + " lastUpdate " + lastUpdate + " size " + size);
        return lastUpdate;
    }
}

But this is not working properly. It immediately prints all the way up to the file size, as follows:

name test  progress 4096 lastUpdate 0 size 30489
name test  progress 8192 lastUpdate 0 size 30489
name test  progress 12288 lastUpdate 0 size 30489
name test  progress 16384 lastUpdate 0 size 30489
name test  progress 20480 lastUpdate 0 size 30489
name test  progress 24576 lastUpdate 0 size 30489
name test  progress 28672 lastUpdate 0 size 30489
name test  progress 30489 lastUpdate 0 size 30489
name test  progress 30489 lastUpdate 0 size 30489

And the actual upload takes much longer (more than ten times as long as it takes to print these lines).

What should I do so that I can get the true upload status?

I found the answer to my question. The best way to get the true progress status is by using the code below:

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType(mpf.getContentType());
metadata.setContentLength(mpf.getSize());

String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
PutObjectRequest putObjectRequest = new PutObjectRequest(
        Constants.S3_BUCKET_NAME, key, mpf.getInputStream(), metadata)
        .withStorageClass(StorageClass.ReducedRedundancy);

putObjectRequest.setProgressListener(new ProgressListener() {
    @Override
    public void progressChanged(ProgressEvent progressEvent) {
        System.out.println(progressEvent.getBytesTransfered()
                + " >> number of bytes transferred " + new Date());

        double totalByteRead = request.getSession()
                .getAttribute(Constants.TOTAL_BYTE_READ) != null
                ? (Double) request.getSession().getAttribute(Constants.TOTAL_BYTE_READ)
                : 0;
        totalByteRead += progressEvent.getBytesTransfered();
        request.getSession().setAttribute(Constants.TOTAL_BYTE_READ, totalByteRead);
        System.out.println("total bytes read " + totalByteRead);

        request.getSession().setAttribute(Constants.TOTAL_PROGRESS, (totalByteRead / size) * 100);
        System.out.println("percentage completed >>> " + (totalByteRead / size) * 100);

        if (progressEvent.getEventCode() == ProgressEvent.COMPLETED_EVENT_CODE) {
            System.out.println("completed ******");
        }
    }
});
s3Client.putObject(putObjectRequest);

The problem with my previous code was that I was not setting the content length in the object metadata, so I was not getting the true progress status. The lines below are copied from the PutObjectRequest class API documentation:

Constructs a new PutObjectRequest object to upload a stream of data to the specified bucket and key. After constructing the request, users may optionally specify object metadata or a canned ACL as well.

Content length for the data stream must be specified in the object metadata parameter; Amazon S3 requires it to be passed in before the data is uploaded. Failure to specify a content length will cause the entire contents of the input stream to be buffered locally in memory so that the content length can be calculated, which can result in negative performance problems.
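
Once the listener has written Constants.TOTAL_PROGRESS into the HTTP session, the browser still needs a way to read it. Below is a minimal sketch of a polling endpoint, assuming Spring MVC is already in use (the upload code casts the request to MultipartHttpServletRequest); the controller name and mapping path are hypothetical, and only the session attribute name comes from the code above.

import javax.servlet.http.HttpServletRequest;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class UploadProgressController {

    // Hypothetical polling endpoint: the page calls it periodically (e.g. every second)
    // and renders the returned percentage in a progress bar.
    @RequestMapping(value = "/uploadProgress", method = RequestMethod.GET)
    @ResponseBody
    public String getProgress(HttpServletRequest request) {
        Object progress = request.getSession().getAttribute(Constants.TOTAL_PROGRESS);
        // Before the first progress event fires, the attribute is still null.
        return progress != null ? String.valueOf(progress) : "0";
    }
}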

I'm going to assume you are using the AWS SDK for Java.

Your code is working as it should: it shows read being called with 4K read each time. Your idea (updated in the message) is also correct: the AWS SDK provides ProgressListener as a way to inform the application of progress in the upload.

The "problem" is in the implementation of the AWS SDK: it buffers more than the ~30K size of your file (I'm going to assume it's 64K), so you're not getting any progress reports.

Try uploading a bigger file (say 1 MB) and you'll see both methods give you better results; after all, with today's network speeds, reporting progress on a 30K file is hardly worth it.

If you want better control you could implement the upload yourself using the S3 REST interface (which is what the AWS Java SDK ultimately uses). It is not very difficult, but it is a bit of work. If you want to go this route, I recommend finding an example for computing the session authorization token instead of doing it yourself (sorry, my search foo is not strong enough for a link to actual sample code right now). However, once you go to all that trouble, you'll find that you actually want a 64K buffer on the socket stream to ensure maximum throughput on a fast network (which is probably why the AWS Java SDK behaves as it does).
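
If you would rather stay within the SDK but get finer-grained progress on large files, one higher-level route (not what the answer above describes, just a related option in the AWS SDK for Java v1) is the TransferManager, which performs multipart uploads and fires progress events per part. A minimal sketch, assuming the same s3Client as in the question and placeholder bucket, key and file values:

import java.io.File;

import com.amazonaws.event.ProgressEvent;
import com.amazonaws.event.ProgressListener;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.Upload;

public class TransferManagerProgressExample {

    public static void upload(AmazonS3 s3Client, String bucketName, String key, File file)
            throws InterruptedException {
        TransferManager tm = new TransferManager(s3Client);
        final Upload upload = tm.upload(bucketName, key, file);

        // Note: this is com.amazonaws.event.ProgressListener, not the older
        // com.amazonaws.services.s3.model.ProgressListener used in the code above.
        upload.addProgressListener(new ProgressListener() {
            @Override
            public void progressChanged(ProgressEvent progressEvent) {
                // TransferProgress tracks the whole transfer, so no manual byte
                // counting or session bookkeeping is needed here.
                System.out.printf("%.2f%% transferred%n",
                        upload.getProgress().getPercentTransferred());
            }
        });

        upload.waitForCompletion(); // blocks until the upload finishes or fails
        tm.shutdownNow(false);      // stop the transfer threads, keep s3Client alive
    }
}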
