
AmazonS3 putObject with InputStream length example

I am uploading a file to S3 using Java - this is what I got so far:

AmazonS3 s3 = new AmazonS3Client(new BasicAWSCredentials("XX","YY"));
List<Bucket> buckets = s3.listBuckets();
s3.putObject(new PutObjectRequest(buckets.get(0).getName(), fileName, stream, new ObjectMetadata()));

The file is being uploaded, but a WARNING is raised when I am not setting the content length:

com.amazonaws.services.s3.AmazonS3Client putObject: No content length specified for stream data. Stream contents will be buffered in memory and could result in out of memory errors.

This is a file I am uploading, and the stream variable is an InputStream, from which I can get the byte array like this: IOUtils.toByteArray(stream).

So when I try to set the content length and MD5 (taken from here) like this:

// get MD5 base64 hash
MessageDigest messageDigest = MessageDigest.getInstance("MD5");
messageDigest.reset();
messageDigest.update(IOUtils.toByteArray(stream));
byte[] resultByte = messageDigest.digest();
String hashtext = new String(Hex.encodeHex(resultByte));

ObjectMetadata meta = new ObjectMetadata();
meta.setContentLength(IOUtils.toByteArray(stream).length);
meta.setContentMD5(hashtext);

It causes the following error to come back from S3:

The Content-MD5 you specified was invalid.

What am I doing wrong?

Any help appreciated!

PS I am on Google App Engine - I cannot write the file to disk or create a temp file because AppEngine does not support FileOutputStream.

Because the original question was never answered, and I ran into this same problem, here is the solution to the MD5 issue: S3 doesn't want the Hex-encoded MD5 string we normally think of.

Instead, I had to do this.

// content is a passed-in InputStream
// DigestUtils and Base64 come from Apache commons-codec
byte[] resultByte = DigestUtils.md5(content);
String streamMD5 = new String(Base64.encodeBase64(resultByte));
metaData.setContentMD5(streamMD5);

Essentially what they want for the MD5 value is the Base64-encoded raw MD5 byte array, not the Hex string. When I switched to this, it started working great for me.
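For illustration, here is a minimal, self-contained sketch (using Apache commons-codec, which the snippet above already assumes) that computes both encodings of the same digest; only the Base64 form is what setContentMD5 expects:

import java.nio.charset.StandardCharsets;

import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.Hex;
import org.apache.commons.codec.digest.DigestUtils;

public class Md5EncodingDemo {
    public static void main(String[] args) {
        byte[] md5 = DigestUtils.md5("hello".getBytes(StandardCharsets.UTF_8));

        // Hex encoding - what the question used; S3 rejects this for Content-MD5
        System.out.println(Hex.encodeHexString(md5));        // 5d41402abc4b2a76b9719d911017c592

        // Base64 of the raw digest bytes - what S3 expects
        System.out.println(Base64.encodeBase64String(md5));  // XUFAKrxLKna5cZ2REBfFkg==
    }
}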

If all you are trying to do is resolve the content-length error from Amazon, then you could just read the input stream into a byte array, take its length as a Long, and add that to the metadata.

/*
 * Obtain the content length of the input stream for the S3 header
 */
byte[] contentBytes = null;
try {
    InputStream is = event.getFile().getInputstream();
    contentBytes = IOUtils.toByteArray(is);
} catch (IOException e) {
    System.err.printf("Failed while reading bytes from %s", e.getMessage());
}

Long contentLength = Long.valueOf(contentBytes.length);

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(contentLength);

/*
 * Reobtain the tmp uploaded file as input stream
 */
InputStream inputStream = event.getFile().getInputstream();

/*
 * Put the object in S3
 */
try {

    s3client.putObject(new PutObjectRequest(bucketName, keyName, inputStream, metadata));

} catch (AmazonServiceException ase) {
    System.out.println("Error Message:    " + ase.getMessage());
    System.out.println("HTTP Status Code: " + ase.getStatusCode());
    System.out.println("AWS Error Code:   " + ase.getErrorCode());
    System.out.println("Error Type:       " + ase.getErrorType());
    System.out.println("Request ID:       " + ase.getRequestId());
} catch (AmazonClientException ace) {
    System.out.println("Error Message: " + ace.getMessage());
} finally {
    if (inputStream != null) {
        inputStream.close();
    }
}

Note that with this exact method you read the input stream twice, so if you are uploading a very large file you might want to read it once into an array and then upload from there.
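A rough sketch of that single-read variant (sourceStream is an illustrative name; IOUtils is from commons-io, as above): buffer the stream once, then derive both the length and the upload stream from the same byte array:

// Read the source exactly once (assumes the payload fits comfortably in heap)
byte[] contentBytes = IOUtils.toByteArray(sourceStream);

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(contentBytes.length);

// A ByteArrayInputStream over the same buffer replaces the second read of the source
s3client.putObject(new PutObjectRequest(bucketName, keyName,
        new ByteArrayInputStream(contentBytes), metadata));

The trade-off is the same one the warning describes: the whole payload sits in memory, so this only makes sense for uploads you know are reasonably small.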

For uploading, the S3 SDK has two putObject methods:

PutObjectRequest(String bucketName, String key, File file)

and

PutObjectRequest(String bucketName, String key, InputStream input, ObjectMetadata metadata)

The InputStream+ObjectMetadata method needs, at minimum, the Content-Length of your InputStream in the metadata. If you don't set it, the SDK will buffer the stream in memory to determine the length, and this could cause an OOM. Alternatively, you could do your own in-memory buffering to get the length, but then you need to obtain a second InputStream.

Not asked by the OP (limitations of his environment), but for someone else, such as me: I find it easier, and safer (if you have access to a temp file), to write the InputStream to a temp file and put the temp file. There is no in-memory buffering and no requirement to create a second InputStream.

AmazonS3 s3Service = new AmazonS3Client(awsCredentials);
File scratchFile = File.createTempFile("prefix", "suffix");
try {
    // FileUtils is from Apache commons-io
    FileUtils.copyInputStreamToFile(inputStream, scratchFile);
    PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, id, scratchFile);
    PutObjectResult putObjectResult = s3Service.putObject(putObjectRequest);
} finally {
    if (scratchFile.exists()) {
        scratchFile.delete();
    }
}

While writing to S3, you need to specify the length of the S3 object to be sure that there are no out-of-memory errors.

Using IOUtils.toByteArray(stream) is also prone to OOM errors, because it is backed by a ByteArrayOutputStream.

So the best option is to first write the InputStream to a temp file on local disk, and then use that file to write to S3, specifying the length of the temp file.
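A minimal sketch of that temp-file approach using java.nio, equivalent to the commons-io version shown earlier (s3client, bucketName, and keyName are illustrative names; Files, Path, and StandardCopyOption are from java.nio.file):

Path tmp = Files.createTempFile("s3-upload-", ".tmp");
try {
    // Spool the stream to disk; the file then supplies both the content and its exact length
    Files.copy(inputStream, tmp, StandardCopyOption.REPLACE_EXISTING);
    s3client.putObject(new PutObjectRequest(bucketName, keyName, tmp.toFile()));
} finally {
    Files.deleteIfExists(tmp);
}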

I am actually doing somewhat the same thing, but on my AWS S3 storage:

Code for the servlet which receives the uploaded file:

import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

import com.src.code.s3.S3FileUploader;

public class FileUploadHandler extends HttpServlet {

    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        doPost(request, response);
    }

    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        PrintWriter out = response.getWriter();

        try{
            List<FileItem> multipartfiledata = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);

            //upload to S3
            S3FileUploader s3 = new S3FileUploader();
            String result = s3.fileUploader(multipartfiledata);

            out.print(result);
        } catch(Exception e){
            System.out.println(e.getMessage());
        }
    }
}

Code which uploads this data as an AWS object:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.List;
import java.util.UUID;

import org.apache.commons.fileupload.FileItem;

import com.amazonaws.AmazonClientException;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.ClasspathPropertiesFileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

public class S3FileUploader {


    private static String bucketName     = "***NAME OF YOUR BUCKET***";
    private static String keyName        = "Object-"+UUID.randomUUID();

    public String fileUploader(List<FileItem> fileData) throws IOException {
        AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
        String result = "Upload unsuccessfull because ";
        try {

            S3Object s3Object = new S3Object();

            ObjectMetadata omd = new ObjectMetadata();
            omd.setContentType(fileData.get(0).getContentType());
            omd.setContentLength(fileData.get(0).getSize());
            omd.setHeader("filename", fileData.get(0).getName());

            ByteArrayInputStream bis = new ByteArrayInputStream(fileData.get(0).get());

            s3Object.setObjectContent(bis);
            s3.putObject(new PutObjectRequest(bucketName, keyName, bis, omd));
            s3Object.close();

            result = "Uploaded Successfully.";
        } catch (AmazonServiceException ase) {
           System.out.println("Caught an AmazonServiceException, which means your request made it to Amazon S3, but was "
                + "rejected with an error response for some reason.");

           System.out.println("Error Message:    " + ase.getMessage());
           System.out.println("HTTP Status Code: " + ase.getStatusCode());
           System.out.println("AWS Error Code:   " + ase.getErrorCode());
           System.out.println("Error Type:       " + ase.getErrorType());
           System.out.println("Request ID:       " + ase.getRequestId());

           result = result + ase.getMessage();
        } catch (AmazonClientException ace) {
           System.out.println("Caught an AmazonClientException, which means the client encountered an internal error while "
                + "trying to communicate with S3, such as not being able to access the network.");

           result = result + ace.getMessage();
         }catch (Exception e) {
             result = result + e.getMessage();
       }

        return result;
    }
}

Note: I am using an AWS properties file for the credentials.

Hope this helps.

I created a library that uses multipart upload in the background to avoid buffering everything in memory, and that also doesn't write to disk: https://github.com/alexmojaki/s3-stream-upload
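For reference, this is roughly the idea such a library implements on top of the SDK's low-level multipart API. A simplified, single-threaded sketch (error handling such as abortMultipartUpload is omitted, the 5 MB part size is the S3 minimum for all but the last part, readFully is a small helper defined below, and a non-empty stream is assumed):

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.*;

public class StreamingUploadSketch {

    // Streams an InputStream of unknown length to S3 in 5 MB parts,
    // so at most one part is buffered in memory at a time.
    public static void upload(AmazonS3 s3, String bucket, String key, InputStream in)
            throws IOException {
        InitiateMultipartUploadResult init = s3.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(bucket, key));
        List<PartETag> partETags = new ArrayList<>();
        byte[] buffer = new byte[5 * 1024 * 1024];
        int partNumber = 1;
        int read;
        while ((read = readFully(in, buffer)) > 0) {
            UploadPartRequest part = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(init.getUploadId())
                    .withPartNumber(partNumber++)
                    .withInputStream(new ByteArrayInputStream(buffer, 0, read))
                    .withPartSize(read);
            partETags.add(s3.uploadPart(part).getPartETag());
        }
        s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
                bucket, key, init.getUploadId(), partETags));
    }

    // Fill the buffer as far as possible; returns the bytes read, 0 at end of stream.
    private static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        int n;
        while (total < buf.length && (n = in.read(buf, total, buf.length - total)) > 0) {
            total += n;
        }
        return total;
    }
}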

Just passing the File object to the putObject method worked for me. If you are getting a stream, try writing it to a temp file before passing it on to S3.

amazonS3.putObject(bucketName, id, fileObject);

I am using AWS SDK v1.11.414.

The answer at https://stackoverflow.com/a/35904801/2373449 helped me.

Adding the log4j-1.2.12.jar file solved the problem for me.
