
Error reading from large Cassandra table with Spark, getting "Remote RPC client disassociated"

I set up a standalone Spark cluster (with Cassandra), but when I read data I get an error. My cluster has 3 nodes, and each node has 64 GB of RAM and 20 cores. Some of my spark-env.sh configuration: spark_executor_cores: 5, spark_executor_memory: 5G, spark_worker_cores: 20 and spark_worker_memory: 45g.
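
To make that clearer, here is roughly how those values map onto Spark's configuration files as I understand it (a sketch of the settings listed above, not a verbatim copy of my files; the worker settings live in spark-env.sh, the executor settings are ordinary Spark properties):

    # spark-env.sh (on each worker node)
    SPARK_WORKER_CORES=20
    SPARK_WORKER_MEMORY=45g

    # spark-defaults.conf (or passed with --conf at launch)
    spark.executor.cores    5
    spark.executor.memory   5g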

One more piece of information: when I read a small table there is no problem, but when I read a big table I get the error. The error is described below. Also, when I start pyspark I use this command:

$ ./pyspark --master spark://10.0.0.100:7077 \
    --packages com.datastax.spark:spark-cassandra-connector_2.12:3.1.0 \
    --conf spark.driver.extraJavaOptions=-Xss1024m \
    --conf spark.driver.port=36605 \
    --conf spark.driver.blockManager.port=42365
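
Inside the pyspark shell I read the table through the Cassandra connector. I have not included my exact code, but a minimal sketch of that kind of read looks like this (the keyspace and table names here are placeholders, and spark.cassandra.connection.host is assumed to be set elsewhere, e.g. via --conf):

    # Read a Cassandra table through the connector's DataFrame API.
    # "my_keyspace" and "big_table" are placeholders, not my real names.
    df = (spark.read
          .format("org.apache.spark.sql.cassandra")
          .options(keyspace="my_keyspace", table="big_table")
          .load())

    # A simple action such as counting the rows is where the error
    # shows up with the large table.
    df.count()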

Thanks for your interest.

ERROR TaskSchedulerImpl: Lost executor 5 on 10.0.0.10: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (10.0.0.10 executor 5): ExecutorLostFailure (executor 5 exited caused by one of the running tasks) Reason: Remote RPC client disassociated.
WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) (10.0.0.11 executor 2): java.lang.StackOverflowError
 at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1094)
 at java.base/java.nio.HeapByteBuffer.get(HeapByteBuffer.java:184)
 at org.apache.spark.util.ByteBufferInputStream.read(ByteBufferInputStream.scala:49)
 at java.base/java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2887)
 at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2903)
 at java.base/java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3678) 
 at java.base/java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:3678)
 at java.base/java.io.ObjectInputStream.readString(ObjectInputStream.java:2058)
 at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1663)
 at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2490)
 at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2384)
 at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)
 at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1681)
 at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2490)
 at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2384)
 at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)
 at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1681)
 at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2490)
 at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2384)
 at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)
 at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1681)
 at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2490)
 at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2384)
 at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)

The problem you're running into is most likely a networking issue.

It's highly unusual that you need to pin the driver ports with:

    --conf spark.driver.port=36605
    --conf spark.driver.blockManager.port=42365

You'll need to provide background information on why you're doing this.

Also, as I previously advised you on another question last week, you need to provide the minimal code and minimal configuration that replicates the problem. Otherwise, there isn't enough information for others to be able to help you. Cheers!
