
Spark High Availability

I'm using Spark 1.2.1 on three nodes that run three workers via the slaves configuration, and I run daily jobs using:

./spark-1.2.1/sbin/start-all.sh

# crontab configuration:
./spark-1.2.1/bin/spark-submit --master spark://11.11.11.11:7077 --driver-class-path /home/ubuntu/spark-cassandra-connector-java-assembly-1.2.1-FAT.jar --class "$class" "$jar"
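For context, the full crontab line would look roughly like the following; the 02:00 schedule and the log path are placeholders, and "$class" / "$jar" stand in for the actual job class and assembly jar as above:

# hypothetical crontab entry: run the job daily at 02:00 and capture its output
0 2 * * * /home/ubuntu/spark-1.2.1/bin/spark-submit --master spark://11.11.11.11:7077 --driver-class-path /home/ubuntu/spark-cassandra-connector-java-assembly-1.2.1-FAT.jar --class "$class" "$jar" >> /home/ubuntu/spark-cron.log 2>&1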

I want to keep the Spark master and the slave workers available at all times, so that even if one of them fails it gets restarted like a service (the way Cassandra does).

Is there any way to do it?

EDIT:

I looked into the start-all.sh script and it only contains calls to the start-master.sh and start-slaves.sh scripts. I tried to create a supervisor configuration file for it, but I only get the errors below:

11.11.11.11: ssh: connect to host 11.11.11.12 port 22: No route to host
11.11.11.13: org.apache.spark.deploy.worker.Worker running as process 14627. Stop it first.
11.11.11.11: ssh: connect to host 11.11.11.12 port 22: No route to host
11.11.11.12: ssh: connect to host 11.11.11.13 port 22: No route to host
11.11.11.11: org.apache.spark.deploy.worker.Worker running as process 14627. Stop it first.
11.11.11.12: ssh: connect to host 11.11.11.12 port 22: No route to host
11.11.11.13: ssh: connect to host 11.11.11.13 port 22: No route to host
11.11.11.11: org.apache.spark.deploy.worker.Worker running as process 14627. Stop it first.

There are tools such as monit and supervisor (or even systemd) that can monitor and restart failed processes.
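Note that start-all.sh itself is a poor target for a process supervisor: it SSHes to the other nodes, launches the master and worker daemons in the background, and then exits, so the supervisor has nothing long-running to watch, and re-running it while the daemons are still up produces the "Stop it first" messages shown above. A common workaround is to supervise the master and worker JVMs directly on each node, launching them in the foreground through spark-class. Below is a minimal supervisord sketch along those lines; the installation path, user and log locations are assumptions, and the master address mirrors the spark://11.11.11.11:7077 from the question.

; /etc/supervisor/conf.d/spark-master.conf  (on the master node; paths are assumptions)
[program:spark-master]
; spark-class keeps the Master in the foreground, unlike start-master.sh which daemonizes
command=/home/ubuntu/spark-1.2.1/bin/spark-class org.apache.spark.deploy.master.Master --host 11.11.11.11 --port 7077 --webui-port 8080
user=ubuntu
autostart=true
autorestart=true
stdout_logfile=/var/log/spark/master.log
redirect_stderr=true

; /etc/supervisor/conf.d/spark-worker.conf  (on each worker node)
[program:spark-worker]
command=/home/ubuntu/spark-1.2.1/bin/spark-class org.apache.spark.deploy.worker.Worker spark://11.11.11.11:7077
user=ubuntu
autostart=true
autorestart=true
stdout_logfile=/var/log/spark/worker.log
redirect_stderr=true

After placing the files, supervisorctl reread followed by supervisorctl update should start both programs and restart them automatically if they die; a systemd unit with Restart=always (or a monit check) can achieve the same effect if you prefer those tools.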
