
How should I handle "java.net.SocketException: Connection reset" in a multithreaded AWS S3 file upload?

I have a thread-pool ExecutorService to which I submit Runnable jobs that upload large (1-2 GB) files to Amazon S3 using the AWS Java SDK. Occasionally one of my worker threads reports a java.net.SocketException with "Connection reset" as the cause and then dies.

AWS doesn't use checked exceptions, so I can't catch SocketException directly; it must be wrapped somehow. My question is how I should deal with this problem so that I can retry any problematic uploads and increase the reliability of my program.

Would the Multipart Upload API be more reliable?

Is there some exception I can reliably catch to enable retries?

Here's the stack trace. The com.example.* code is mine. Basically, the DataProcessorAWS call invokes putObject(String bucketName, String key, File file) on an AmazonS3Client instance that's shared across threads.

14/12/11 18:43:17 INFO http.AmazonHttpClient: Unable to execute HTTP request: Connection reset
java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
    at sun.security.ssl.OutputRecord.write(OutputRecord.java:363)
    at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:830)
    at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:801)
    at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
    at org.apache.http.impl.io.AbstractSessionOutputBuffer.write(AbstractSessionOutputBuffer.java:169)
    at org.apache.http.impl.io.ContentLengthOutputStream.write(ContentLengthOutputStream.java:119)
    at org.apache.http.entity.InputStreamEntity.writeTo(InputStreamEntity.java:102)
    at com.amazonaws.http.RepeatableInputStreamRequestEntity.writeTo(RepeatableInputStreamRequestEntity.java:153)
    at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
    at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
    at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
    at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
    at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doSendRequest(SdkHttpRequestExecutor.java:47)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:685)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3697)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1434)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1294)
    at com.example.DataProcessorAWS$HitWriter.close(DataProcessorAWS.java:156)
    at com.example.DataProcessorAWS$Processor.run(DataProcessorAWS.java:264)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

For that, use AmazonS3Client directly, rather than TransferManager, for both uploads and downloads.

You have to configure the AmazonS3Client with the following properties:

1. connectionTimeout = 50000 (ms)
2. maxConnections = 500
3. socketTimeout = 50000 (ms)
4. maxErrorRetry = 10
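
As a rough sketch of how those four properties could be applied with the AWS SDK for Java 1.x: the class and method names S3ClientFactory and create are made up for illustration, and the values are simply the ones listed above, not tuned recommendations. It assumes the SDK's ClientConfiguration setters and the AmazonS3Client(ClientConfiguration) constructor, which picks up credentials from the default provider chain.

```java
import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.s3.AmazonS3Client;

// Hypothetical helper; only the ClientConfiguration setters come from the SDK.
final class S3ClientFactory {
    private S3ClientFactory() {}

    static AmazonS3Client create() {
        ClientConfiguration config = new ClientConfiguration();
        config.setConnectionTimeout(50_000); // ms allowed for establishing a connection
        config.setMaxConnections(500);       // upper bound on pooled HTTP connections
        config.setSocketTimeout(50_000);     // ms of socket inactivity before timing out
        config.setMaxErrorRetry(10);         // SDK-level retries for failed requests
        return new AmazonS3Client(config);
    }
}
```

Later 1.x releases deprecate this constructor in favor of AmazonS3ClientBuilder, so depending on the SDK version in use the last line may need adjusting.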

For upload, use:

s3Client.putObject(bucketName, key, inputFile);

For download, use:

S3Object s3Object = s3Client.getObject(new GetObjectRequest(bucketName, key));
InputStream downloadStream = s3Object.getObjectContent();

and store that stream by reading its bytes. (Here s3Client is the configured AmazonS3Client instance.)
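
One way to "store the stream by reading bytes" is to let the JDK do the copying. The sketch below uses only java.nio; StreamSaver and saveToFile are hypothetical names, and the InputStream would be whatever getObjectContent() returned.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for writing a download stream to disk.
final class StreamSaver {
    private StreamSaver() {}

    /** Copies the whole stream to target, replacing any existing file; returns bytes written. */
    static long saveToFile(InputStream in, Path target) throws IOException {
        // try-with-resources closes the stream even if the copy fails part-way.
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Closing the stream matters here because, as far as I know, the S3 object stream holds an HTTP connection from the pool until it is closed.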

'Connection reset' means the connection is hosed. Close the socket, or whatever higher-level construct you're using. Probably the server has decided the upload is too large, or it's overloaded, or something similar. Whether you can retry the operation is something only you can know.
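
Since the SocketException arrives wrapped in an unchecked exception, one way to recognize it is to walk the cause chain. This is a minimal sketch using only the JDK; ResetDetector and causedBySocketException are hypothetical names, and the depth limit of 32 is an arbitrary safety bound.

```java
import java.net.SocketException;

// Hypothetical helper for spotting a wrapped SocketException.
final class ResetDetector {
    private ResetDetector() {}

    /** Returns true if t, or any exception in its cause chain, is a SocketException. */
    static boolean causedBySocketException(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof SocketException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }
}
```

A worker could call this from a catch block around the upload to decide whether the failure was a dropped connection rather than, say, a permissions error.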

The Javadoc says that putObject() throws AmazonClientException when this kind of error occurs. AmazonClientException has a method called isRetryable(); you can try using that to decide whether to retry.
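
A retry loop around the upload could look roughly like the sketch below. It is deliberately independent of the AWS SDK: RetrySketch and callWithRetry are hypothetical names, the retry decision is passed in as a predicate, and the doubling delay is just one possible backoff policy.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical retry helper; not part of the AWS SDK.
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs task up to maxAttempts times. A failure is retried only when
     * retryable accepts it; otherwise, or on the last attempt, it is rethrown.
     */
    static <T> T callWithRetry(Callable<T> task, Predicate<Exception> retryable,
                               int maxAttempts, long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                Thread.sleep(delay); // back off before the next attempt
                delay *= 2;          // double the wait each time
            }
        }
    }
}
```

With the SDK, the task would be the putObject(bucketName, key, file) call and the predicate would check for an AmazonClientException whose isRetryable() returns true. Uploading from a File makes a retry safe to repeat, because the data can be re-read from the start. Note that the SDK also retries internally up to maxErrorRetry times, so an outer loop like this only matters once those retries are exhausted.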

In my case, I am saving an object with some metadata. While prototyping, we added a system metadata entry, "time-created", and that consistently resulted in an error similar to the one in the question. The request is not retryable, since it attempts to modify system metadata. But according to the API documentation, quoted below,

Throws:

  • SdkClientException - If any errors are encountered in the client while making the request or handling the response.
  • AmazonServiceException - If any errors occurred in Amazon S3 while processing the request.

My understanding is that the expected behavior should be to either

  • throw one of the above exceptions, or
  • return a 403, rather than an umbrella AmazonClientException saying "Connection shutdown".

See the logs below, captured with S3 debug logging turned on:

2017-07-11 17:12:45,163 DEBUG  [org.apache.http.wire] - http-outgoing-3 >> "[write] I/O error: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Connection reset"
2017-07-11 17:12:45,221 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "PUT /dev-ec-dev/neon/ECTE/CPER/2017-6/ECTE_dp0p_CPER_2017-06-05.h5 HTTP/1.1[\r][\n]"
2017-07-11 17:12:45,221 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "Host: s3.data.neonscience.org[\r][\n]"
2017-07-11 17:12:45,221 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "Authorization: AWS dev-ec-owner:dE6ouKtPpP9aftLRba4OVn6DH9M=[\r][\n]"
2017-07-11 17:12:45,222 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "x-amz-meta-site: CPER[\r][\n]"
2017-07-11 17:12:45,226 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "x-amz-meta-time-created: 2017-07-11T17:12:44.904[\r][\n]"

More logs, with regular logging:

2017-07-11 17:12:45,247 DEBUG  [org.apache.http.wire] - http-outgoing-4 >> "[write] I/O error: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Connection reset"
2017-07-11 17:12:45,248 ERROR  [org.battelle.neon.is.transition.ec.ECTELevelZeroPrimeJob] - Error copyDataToS3: Not getting descriptive AmazonS3Exception.
Batch runner failed.com.amazonaws.AmazonClientException: Unable to execute HTTP request: Connection reset
java.lang.RuntimeException: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Connection reset
        at org.battelle.neon.is.transition.ec.ECTELevelZeroPrimeJob.runTransition(ECTELevelZeroPrimeJob.java:347)
        at org.battelle.neon.is.batch.BatchRunner.lambda$null$1(BatchRunner.java:152)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
