GCP Dataflow - SSLHandshakeException

I'm encountering the following when running large (>1000 CPUs) and medium-sized (100-1000 CPUs) Dataflow jobs:

exception: "javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake"

The error is not fatal; however, once it begins, it recurs roughly every 30 seconds. Jobs that display this error never finish (I've waited over 4x the expected run time) and produce very limited results (less than 4% of the expected output). The limited output, when there is any, arrives early in the job, and nothing more is produced afterwards.

I use both BigQueryIO and JdbcIO Apache Beam sources and sinks.

It is important to note that my jobs worked correctly in early June, but have been showing this error since the beginning of July.

I have an open case with Google's Enterprise Support, but let's just say results are not forthcoming. The only point of interest Google has offered is that the error might occur "in case the workers scale up and heavily access Cloud Storage".

However, no solution came with that statement. This is an example of the complete error as recorded in the logs:

   exception:  "javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:93)
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
    at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.getMetadata(GoogleCloudStorageReadChannel.java:573)
    at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.openStreamAndSetMetadata(GoogleCloudStorageReadChannel.java:645)
    at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.performLazySeek(GoogleCloudStorageReadChannel.java:560)
    at com.google.cloud.hadoop.gcsio.GoogleCloudStorageReadChannel.read(GoogleCloudStorageReadChannel.java:289)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    at org.apache.beam.sdk.util.VarInt.decodeLong(VarInt.java:79)
    at org.apache.beam.sdk.util.VarInt.decodeInt(VarInt.java:63)
    at org.apache.beam.runners.dataflow.internal.IsmFormat$KeyPrefixCoder.decode(IsmFormat.java:709)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader.readKey(IsmReader.java:1001)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader.access$2000(IsmReader.java:79)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$WithinShardIsmReaderIterator.advance(IsmReader.java:953)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$WithinShardIsmReaderIterator.start(IsmReader.java:943)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$IsmCacheLoader.call(IsmReader.java:581)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$IsmCacheLoader.call(IsmReader.java:570)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$IsmCacheLoader.call(IsmReader.java:555)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4904)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
    at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4899)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader.fetch(IsmReader.java:606)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader.getBlock(IsmReader.java:771)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader.access$1000(IsmReader.java:79)
    at com.google.cloud.dataflow.worker.runners.worker.IsmReader$IsmPrefixReaderIterator.get(IsmReader.java:642)
    at com.google.cloud.dataflow.worker.runners.worker.IsmSideInputReader$ListOverReaderIterators.getUsingLong(IsmSideInputReader.java:679)
    at com.google.cloud.dataflow.worker.runners.worker.IsmSideInputReader$ListOverReaderIterators.access$1300(IsmSideInputReader.java:625)
    at com.google.cloud.dataflow.worker.runners.worker.IsmSideInputReader$ListOverReaderIterators$ListIteratorOverReaderIterators.next(IsmSideInputReader.java:720)
    at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1042)
    at com.application.strategy.simulator.MainStrategySimulator$1.processElement(MainStrategySimulator.java:224)
    at com.application.strategy.simulator.MainStrategySimulator$1$auxiliary$4N23tth9.invokeProcessElement(Unknown Source)
    at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:199)
    at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:157)
    at com.google.cloud.dataflow.worker.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:244)
    at com.google.cloud.dataflow.worker.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
    at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
    at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
    at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:198)
    at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
    at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:72)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:336)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.doWork(DataflowWorker.java:295)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:242)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:123)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:103)
    at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:90)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException: SSL peer shut down incorrectly
    at sun.security.ssl.InputRecord.read(InputRecord.java:505)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
    ... 69 more

After looking at other instances of this issue online, the error seems to happen only when using Java 7. Is that your Java version?

If yes, I'd suggest trying Java 8 and seeing if that fixes the issue. Let me know if we can help further!
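If you're not sure which runtime the job is actually using, one way to check is to log it from the code itself. Below is a minimal sketch using only the standard library; the class name `RuntimeTlsInfo` and its method names are hypothetical, and in a Dataflow job these calls would have to run on a worker (for example, from inside a DoFn) rather than on the machine that launches the pipeline. It also prints the TLS protocol versions enabled by default, since those can differ between JDK releases.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

// Hypothetical helper (not part of Beam or Dataflow): reports the Java
// version and the TLS protocols the default SSLContext enables.
public class RuntimeTlsInfo {

    // Version string of the JVM this code is running in.
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    // Protocol names enabled by default for the JVM's default SSLContext.
    static List<String> defaultTlsProtocols() {
        try {
            return Arrays.asList(
                SSLContext.getDefault().getDefaultSSLParameters().getProtocols());
        } catch (NoSuchAlgorithmException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + javaVersion());
        System.out.println("default TLS protocols = " + defaultTlsProtocols());
    }
}
```

If the version printed on the workers starts with `1.7`, that would match the Java 7 cases in the issues linked below.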

Related GitHub issues:

https://github.com/spotify/scio/issues/604

https://github.com/ljader/redmine-mylyn-plugin/issues/67
