
Only one spark-submit allowed to run in a Spark-on-YARN cluster environment

I set up a Spark-on-YARN cluster environment: Spark (2.2.0) runs on Windows 7, and the YARN cluster is Hadoop 2.7.3.

I run "spark-shell" to use SparkSQL:

spark-shell --master yarn --deploy-mode client --conf spark.yarn.archive=hdfs://hadoop_273_namenode_ip:namenode_port/spark-archive.zip

Everything is OK so far, but when I start another spark-shell, the message below never stops being printed to the console:

17/10/17 17:33:53 INFO Client: Application report for application_1508232101640_0003 (state: ACCEPTED) 

The application status in the ResourceManager web UI shows

[application status] ACCEPTED: waiting for AM container to be allocated, launched and register with RM

If I close the first spark-shell, the second one starts working.

It seems that multiple spark-shell (spark-submit) sessions are not allowed at the same time (at least in my environment).

How can I remove this limitation?

waiting for AM container to be allocated

It's a resource limitation, so you could make your first job consume fewer resources.

What happens is that the first job consumes all available resources, and by the time the second job comes around nothing has been freed, so the second job has to wait for resources to become available.

That's why, when you close the first shell, the other one will launch.
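For example, a minimal way to cap the first session's footprint is to pass explicit executor and driver limits when starting the first spark-shell. The numbers below are placeholders; the right values depend on how much memory and how many vcores your NodeManagers actually offer:

spark-shell --master yarn --deploy-mode client --num-executors 2 --executor-cores 1 --executor-memory 1g --driver-memory 1g --conf spark.yarn.archive=hdfs://hadoop_273_namenode_ip:namenode_port/spark-archive.zip

With the first shell capped like this, YARN should still have enough memory and vcores left to allocate and launch the second application's AM container, so the second spark-shell can leave the ACCEPTED state.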
