
Failure recovery in Spark running on HDInsight

I was trying to get Apache Spark running on Azure HDInsight by following the steps from http://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-spark-install/.

I was wondering if I have to manage master/slave failure recovery myself, or whether HDInsight will take care of it.

I'm also working on Spark Streaming applications on Azure HDInsight. Inside a Spark job, Spark and YARN can provide some fault tolerance for the master and the slaves, mostly through their retry mechanisms (a configuration sketch is given below).
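
As a rough illustration of the retry behaviour Spark on YARN offers out of the box, here is a minimal Scala sketch. The application name and the retry counts are assumptions for illustration only; the property keys themselves are standard Spark-on-YARN options, but check the version running on your HDInsight cluster.

```scala
import org.apache.spark.SparkConf

object FaultToleranceConf {
  // Hypothetical app name; the retry counts below are illustrative, not recommendations.
  val conf = new SparkConf()
    .setAppName("MyHDInsightApp")
    .set("spark.task.maxFailures", "8")            // retries per task before the stage fails
    .set("spark.yarn.max.executor.failures", "16") // executor (slave-side) failures tolerated
  // Driver/ApplicationMaster retries are governed by spark.yarn.maxAppAttempts, which has to
  // be supplied at submission time (e.g. --conf spark.yarn.maxAppAttempts=4) because in
  // yarn-cluster mode the AM starts before any user code runs.
}
```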

  1. But sometimes the driver and the workers will still crash, due to user-code errors, Spark internal issues, or Azure HDInsight issues. So we need to build our own monitoring/daemon process and handle the recovery ourselves.
  2. For streaming scenarios it is even harder. Since a Spark Streaming job needs to keep running 24/7, the concern is how to make the job recover from machine reboots and reimages (see the checkpoint-recovery sketch after this list).
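
For the streaming case, the usual building block for driver recovery is Spark Streaming's checkpoint mechanism via StreamingContext.getOrCreate. The sketch below assumes a hypothetical checkpoint directory on the cluster's default WASB storage and a placeholder socket source; it is not the poster's actual job.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverableStreamingJob {
  // Hypothetical checkpoint location on the HDInsight cluster's default storage (WASB).
  val checkpointDir = "wasb:///checkpoints/my-streaming-app"

  // Builds a fresh context; only invoked when no checkpoint exists yet.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("RecoverableStreamingJob")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    // Placeholder source and output operation; replace with the real DStream pipeline.
    ssc.socketTextStream("localhost", 9999).count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Restores the job from the checkpoint after a driver restart, or creates it on first run.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that checkpoint recovery only restores the streaming job's state after the driver comes back; an external monitoring/daemon process (as in point 1) is still needed to resubmit the application after a node reboot or reimage.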
