
How to connect Apache Spark with Yarn from the SparkContext?

I have developed a Spark application in Java using Eclipse.
So far, I have been running it in standalone mode by setting the master's address to 'local[*]'.
Now I want to deploy this application on a Yarn cluster.
The only official documentation I found is http://spark.apache.org/docs/latest/running-on-yarn.html

Unlike the documentation for deploying on a Mesos cluster or in standalone mode ( http://spark.apache.org/docs/latest/running-on-mesos.html ), there is no URL to use within SparkContext for the master's address.
Apparently, I have to use the command line to deploy Spark on Yarn.

Do you know if there is a way to configure the master's address in the SparkContext, as in standalone and Mesos mode?

There actually is a URL.

Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and connect to the YARN ResourceManager.
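As a rough sketch of what this means on the application side: the answer above does not spell out the master value, but in Spark 2.0 and later it is simply `yarn` (Spark 1.x used `yarn-client` and `yarn-cluster`), and the ResourceManager address is taken from the configuration directory rather than from the master string. The helper below only collects the relevant Spark properties using the standard library; the class name, method name, and application name are hypothetical, and client deploy mode is assumed because the SparkContext is created inside the application itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: collects the properties that
// select YARN as the cluster manager for an application-created context.
class YarnMasterSettings {

    static Map<String, String> yarnClientSettings(String appName) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.app.name", appName);
        // Assumption: Spark 2.0+; the host and port of the ResourceManager
        // come from yarn-site.xml in HADOOP_CONF_DIR / YARN_CONF_DIR.
        settings.put("spark.master", "yarn");
        // Assumption: client mode, since the driver runs in this JVM.
        settings.put("spark.submit.deployMode", "client");
        return settings;
    }

    public static void main(String[] args) {
        // With spark-core on the classpath, each pair could be applied via
        // SparkConf.set(key, value) before creating a JavaSparkContext.
        yarnClientSettings("my-spark-app")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same effect is usually achieved more directly with `new SparkConf().setAppName(...).setMaster("yarn")`; the map form is used here only to keep the sketch free of external dependencies.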

You should have at least the hdfs-site.xml , yarn-site.xml , and core-site.xml files, which specify all the settings and URLs for the Hadoop cluster you connect to.
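A quick way to check this before starting the application is to look at the environment and the directory it points to. The sketch below is a hypothetical helper, not part of Spark or Hadoop; it assumes HADOOP_CONF_DIR is consulted before YARN_CONF_DIR (an arbitrary choice for illustration) and only tests that the three files exist, not that their contents are valid.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check for the client-side Hadoop configuration.
class YarnConfCheck {

    static final List<String> REQUIRED =
            Arrays.asList("core-site.xml", "hdfs-site.xml", "yarn-site.xml");

    // Returns the configured directory, or null if neither variable is set.
    // Assumption: HADOOP_CONF_DIR is checked first, then YARN_CONF_DIR.
    static Path resolveConfDir(Map<String, String> env) {
        for (String name : new String[] {"HADOOP_CONF_DIR", "YARN_CONF_DIR"}) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return Paths.get(value);
            }
        }
        return null;
    }

    // Lists the required files that are absent from the given directory.
    static List<String> missingFiles(Path confDir) {
        List<String> missing = new ArrayList<>();
        for (String fileName : REQUIRED) {
            if (!Files.isRegularFile(confDir.resolve(fileName))) {
                missing.add(fileName);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Path confDir = resolveConfDir(System.getenv());
        if (confDir == null) {
            System.out.println("Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set");
            return;
        }
        System.out.println("Config directory: " + confDir);
        System.out.println("Missing files: " + missingFiles(confDir));
    }
}
```

When launching from Eclipse, the environment variable has to be set in the run configuration, since the IDE does not necessarily inherit the shell's environment.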

Some properties from yarn-site.xml include yarn.nodemanager.hostname and yarn.nodemanager.address .

Since the address has a default of ${yarn.nodemanager.hostname}:0 , you may only need to set the hostname.
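To see why setting the hostname alone can be enough, here is a simplified sketch of how a `${...}` placeholder in a default value is filled in from another property. It is a single-pass illustration written for this answer, not Hadoop's actual implementation; the class name and the hostname `worker1.example.com` are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of ${name} substitution in a property value.
class PlaceholderDemo {

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with props.get(name); unknown names are left as-is.
    static String expand(String value, Map<String, String> props) {
        Matcher matcher = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = props.getOrDefault(matcher.group(1), matcher.group(0));
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("yarn.nodemanager.hostname", "worker1.example.com"); // made-up host
        // Default value of yarn.nodemanager.address, as quoted above.
        System.out.println(expand("${yarn.nodemanager.hostname}:0", props));
        // prints "worker1.example.com:0"
    }
}
```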
