
Apache Spark: workers are connected and free, but do not accept tasks

I have a simple Spark cluster: one master and one slave. The worker is free and has no busy resources.

Web UI screenshot

But when I try to execute any application (e.g. 'sc.parallelize(1 to 10).foreach(println)' in spark-shell), I see the following error:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

However, when the application is launched on the same server as the slave, it runs successfully. It looks like something is listening on the wrong network interface.

The configuration is the default one; Spark was cloned from GitHub.
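
For reference, a setup like this is typically built from the cloned sources before starting the daemons; the clone URL and build command below follow Spark's standard build instructions and are an assumption about how this particular cluster was prepared:

git clone https://github.com/apache/spark.git
cd spark
./build/mvn -DskipTests clean package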

I start the master the following way:

192.168.111.204@spark > ./sbin/start-master.sh -h 192.168.111.204

slave:

192.168.111.230@spark > ./sbin/start-slave.sh  spark://192.168.111.204:7077 -h 192.168.111.230

application:

192.168.111.229@spark > ./bin/spark-shell --master spark://192.168.111.204:7077

What should I check?

UPD: I just tried the same thing with two virtual machines and it works fine. Maybe the servers have some problems with hostnames.

A few things you can try:

Maybe for some reason the slaves do not have any cores allocated; try starting the slave with -c, as in the example after the usage text below.

-c CORES, --cores CORES Total CPU cores to allow Spark applications 
    to use on the machine (default: all available); only on worker
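
For example, a minimal sketch of starting the worker with an explicit core count (the value 4 below is purely illustrative) would be:

192.168.111.230@spark > ./sbin/start-slave.sh  spark://192.168.111.204:7077 -h 192.168.111.230 -c 4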

Thanks to everyone, the problem is solved. As I guessed, networking was the cause of the trouble.

When spark-shell and spark-submit start, they open a port to listen on. However, I didn't find a flag to specify a host for this purpose, so they started listening on the outer interface, and that port was blocked by a firewall. I had to add the following line to conf/spark-env.sh:

export SPARK_LOCAL_IP=192.168.111.229
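
Setting SPARK_LOCAL_IP makes the driver bind its listening ports to the internal interface instead of the outer one. As a quick sanity check (assuming a Linux host with ss available, and that the driver runs in a java process), you could verify which addresses the driver's ports are bound to:

192.168.111.229@spark > ss -tlnp | grep java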
