
"Unable to execute HTTP Request: Broken Pipe" with Hadoop / S3 on Amazon EMR

I developed a custom JAR for processing data in Elastic MapReduce. The data consists of hundreds of thousands of files on Amazon S3. The JAR doesn't do anything fancy to read the data; it simply uses CombineFileInputFormat.
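The question doesn't show the JAR itself; for context, here is a minimal sketch of the kind of CombineFileInputFormat subclass such a job typically uses (the class names and the split size are hypothetical, not taken from the question):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader;
import org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReaderWrapper;
import org.apache.hadoop.mapreduce.lib.input.CombineFileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Hypothetical input format that packs many small S3 files into fewer, larger splits.
public class CombinedTextInputFormat extends CombineFileInputFormat<LongWritable, Text> {

    public CombinedTextInputFormat() {
        // Cap the combined split size so one mapper doesn't read an unbounded number of files.
        setMaxSplitSize(128 * 1024 * 1024); // 128 MB
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
            TaskAttemptContext context) throws IOException {
        return new CombineFileRecordReader<LongWritable, Text>((CombineFileSplit) split,
                context, TextLineRecordReaderWrapper.class);
    }

    // Delegates each file inside a combined split to the standard TextInputFormat reader.
    public static class TextLineRecordReaderWrapper
            extends CombineFileRecordReaderWrapper<LongWritable, Text> {
        public TextLineRecordReaderWrapper(CombineFileSplit split, TaskAttemptContext context,
                Integer idx) throws IOException, InterruptedException {
            super(new TextInputFormat(), split, context, idx);
        }
    }
}

Note that the stack trace at the bottom shows the failure inside CombineFileInputFormat.getSplits, i.e. while listing the input files during job submission on the master node, not inside the mappers themselves.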

When I run the job against a small amount of test data, everything works perfectly. However, when I run it against my full data set, at some (seemingly random) point in the job I hit an HTTP or socket error of some kind that doesn't appear to be handled properly.

In one job, I got the following in the SYSLOG:

2015-11-16 21:47:17,504 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem (main): exhausted retry un-registered class com.amazonaws.AmazonClientException
2015-11-16 21:47:17,504 INFO org.apache.hadoop.mapreduce.JobSubmitter (main): Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1447686616083_0001

This was accompanied by the following in standard error:

Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Remote host closed connection during handshake

A second job threw a similar error in the SYSLOG, but I got this in standard error:

Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Broken pipe

(The full stack trace is included at the bottom.)

I built this against Hadoop 2.6.0, and I'm using the latest AWS build of Hadoop 2.6.0, so I'm not sure what's causing these errors. Does anyone have ideas on how I can start tackling this?

Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Broken pipe
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:500)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3604)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:999)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:977)
    at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:199)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy21.retrieveMetadata(Unknown Source)
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:907)
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:892)
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.listStatus(EmrFileSystem.java:343)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1498)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1505)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1505)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1524)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1569)
    at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:1746)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1745)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1723)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:299)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:263)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:177)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at com.rw.legion.Legion.main(Legion.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
    at sun.security.ssl.OutputRecord.write(OutputRecord.java:363)
    at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:837)
    at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:808)
    at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:679)
    at sun.security.ssl.Handshaker.sendChangeCipherSpec(Handshaker.java:999)
    at sun.security.ssl.ClientHandshaker.sendChangeCipherAndFinish(ClientHandshaker.java:1161)
    at sun.security.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:1073)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:341)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:901)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:837)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1023)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1332)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1359)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1343)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:535)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:403)
    at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:128)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
    ... 41 more

Set the client configuration of the Amazon client as below to increase the timeouts; network timeouts could be the problem in a case like this.

ClientConfiguration configuration = new ClientConfiguration();
configuration.setMaxErrorRetry(3);
configuration.setConnectionTimeout(50*1000);
configuration.setSocketTimeout(50*1000);
configuration.setProtocol(Protocol.HTTP);
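This snippet assumes you construct the S3 client yourself; EMRFS builds its own internal client, so these settings only take effect where your code calls the AWS SDK for Java directly. A minimal, self-contained sketch of wiring the configuration into a client (the bucket and key are hypothetical):

import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;

public class S3ClientTimeoutExample {
    public static void main(String[] args) {
        // Retry transient failures a few times and allow 50 s for connect/read.
        ClientConfiguration configuration = new ClientConfiguration();
        configuration.setMaxErrorRetry(3);
        configuration.setConnectionTimeout(50 * 1000); // milliseconds
        configuration.setSocketTimeout(50 * 1000);     // milliseconds
        configuration.setProtocol(Protocol.HTTP);      // avoids the TLS handshake seen in the trace

        // Uses the default credential provider chain (e.g. the EMR instance profile).
        AmazonS3Client s3 = new AmazonS3Client(configuration);

        // The same kind of call that fails in the stack trace (getObjectMetadata).
        ObjectMetadata meta = s3.getObjectMetadata("my-bucket", "path/to/object");
        System.out.println("Content-Length: " + meta.getContentLength());
    }
}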

In addition, you need to check the credentials (or certificates) that allow the connection.
