YARN configuration in a Spring Boot application

I am trying to write a YARN application using Spring Boot. To be clear, I am not using the Spring YARN functionality; I am using plain Spring Boot to work with YARN. For some reason, when I create a new YarnConfiguration() object in a Spring Boot application, it loads only core-site.xml and yarn-site.xml, and not the mapred and hdfs files or any of the *-default.xml equivalents. If I don't use Spring Boot, all the XML files are loaded. The problem with the XML files not being loaded is that the application cannot connect to the Resource Manager. I assume this is somehow caused by changes Spring Boot makes to the classpath, but I am not sure how to work around them.

Here is my configuration:

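@Configuration
@EnableConfigurationProperties
@EnableAutoConfiguration
@ComponentScan
public class Application implements CommandLineRunner {

    @Bean
    public org.apache.hadoop.conf.Configuration conf() throws IOException {
        YarnConfiguration conf = new YarnConfiguration();
        // Configuration.toString() lists the resource files that were loaded
        log.info("conf " + conf);
        // Resolves the file system named by fs.defaultFS in the loaded configuration
        log.info("fs " + FileSystem.get(conf));
        // Return the instance that was just inspected instead of building a second one
        return conf;
    }

    // ...
}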

The log output shows that only two XML files are loaded into the configuration, so the fs obtained on the next line is a LocalFileSystem, not HDFS.

Any ideas?

There are a couple of possible issues here:

Regarding the local file system instead of HDFS: YarnConfiguration should load core-site.xml, and your core-site.xml should contain something like:

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://NAMENODE:8020</value>
  </property>

In addition, this core-site.xml must be on your application's classpath. Note that the Hadoop jars also contain a default / empty core-site.xml, so you have to make sure yours takes precedence.
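One way to check which core-site.xml wins is to ask the class loader for every copy it can see. The sketch below uses only the JDK, so it can be run inside the Spring Boot application without touching Hadoop. It rests on the assumption that Hadoop's Configuration resolves resource names through the thread context class loader and takes the first match. The class name CoreSiteCheck and its helper methods are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper, not part of Hadoop or Spring Boot.
public class CoreSiteCheck {

    /** Every copy of the named resource visible to the loader, in lookup order. */
    static List<URL> findAll(ClassLoader loader, String name) throws IOException {
        return Collections.list(loader.getResources(name));
    }

    /** Value of a <property> in a Hadoop-style XML stream, or null if it is not set. */
    static String readProperty(InputStream xml, String key) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && key.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // With the standard class loaders, the first URL listed is normally
        // the copy that a plain getResource() lookup would return.
        for (URL copy : findAll(loader, "core-site.xml")) {
            try (InputStream in = copy.openStream()) {
                System.out.println(copy + " -> fs.defaultFS=" + readProperty(in, "fs.defaultFS"));
            }
        }
    }
}
```

If the first entry printed points into a Hadoop jar, or shows fs.defaultFS=null, then your own core-site.xml is probably not the one being picked up.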

Regarding YARN and MapReduce: YARN is a generic resource management and scheduling framework, and MapReduce is just one type of application that can run on YARN. This is why YarnConfiguration does not load mapred-*.xml; those files are loaded by the MapReduce code when you submit a MapReduce job:

Configuration configuration = new YarnConfiguration();
Job job = Job.newInstance(configuration);
job.getConfiguration(); // this configuration should have mapred-*.xml files loaded
job.submit();
