Exceptions while running Spark job on EMR cluster: "java.io.IOException: All datanodes are bad"
We have an AWS EMR setup to process jobs written in Scala. We are able to run the jobs on a small dataset, but when running the same job on a large dataset I get the exception "java.io.IOException: All datanodes are bad."
Setting spark.shuffle.service.enabled to true resolved this issue for me.
The default configuration of AWS EMR sets spark.dynamicAllocation.enabled to true, but leaves spark.shuffle.service.enabled set to false.
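On EMR, these defaults can be overridden at cluster creation with a `spark-defaults` configuration classification. A minimal sketch via the AWS CLI, in which the cluster name, release label, and instance settings are placeholder assumptions:

```shell
# Enable the external shuffle service alongside dynamic allocation
# through EMR's spark-defaults classification. The cluster name,
# release label, and instance settings are placeholders.
aws emr create-cluster \
  --name "spark-job-cluster" \
  --release-label emr-5.30.0 \
  --applications Name=Spark \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --configurations '[
    {
      "Classification": "spark-defaults",
      "Properties": {
        "spark.dynamicAllocation.enabled": "true",
        "spark.shuffle.service.enabled": "true"
      }
    }
  ]'
```

The same classification JSON can also be supplied in the EMR console or via `--configurations file://...` when the JSON lives in a file.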
spark.dynamicAllocation.enabled allows Spark to assign executors dynamically to different tasks. When spark.shuffle.service.enabled is set to false, the external shuffle service is disabled and shuffle data is stored only on the executors. When an executor is deallocated, its shuffle data is lost, and the exception "java.io.IOException: All datanodes are bad." is thrown on the next data request.
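The two settings can also be supplied from the Scala job itself when building the session. A sketch, assuming the application name is a placeholder; note that on YARN the external shuffle service must already be running on the NodeManagers, which EMR provides by default:

```scala
import org.apache.spark.sql.SparkSession

// Enable dynamic allocation together with the external shuffle
// service, so shuffle files survive executor deallocation.
// The application name is a placeholder.
val spark = SparkSession.builder()
  .appName("LargeDatasetJob")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  .getOrCreate()
```

Equivalently, the properties can be passed on the command line, e.g. `spark-submit --conf spark.shuffle.service.enabled=true --conf spark.dynamicAllocation.enabled=true ...`, without touching the job code.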