
YARN configuration in a Spring Boot application

I am trying to write a YARN application using Spring Boot. To be clear, I am not using the Spring YARN functionality; I am using plain Spring Boot to work with YARN. For some reason, when I create a new YarnConfiguration() object in a Spring Boot application, it loads only core-site.xml and yarn-site.xml, and not the mapred and hdfs files or any of the *-default.xml equivalents. If I don't use Spring Boot, all of the XML files are loaded. Because those files are not loaded, the application cannot connect to the ResourceManager. I assume this is caused by the way Spring Boot changes the classpath, but I am not sure how to work around it.

Here is my configuration:

@Configuration
@EnableConfigurationProperties
@EnableAutoConfiguration
@ComponentScan
public class Application implements CommandLineRunner {

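    @Bean
    public org.apache.hadoop.conf.Configuration conf() throws IOException {
      YarnConfiguration conf = new YarnConfiguration();
      // toString() lists the resource files this configuration was built from
      log.info("conf " + conf.toString());
      // Resolves the default file system from fs.defaultFS in the loaded configuration
      log.info("fs " + FileSystem.get(conf));
      // Return the instance that was just inspected instead of building a second one
      return conf;
    }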

The log output shows that only two XML files are loaded into the configuration, so the file system obtained on the next line is a LocalFileSystem, not HDFS.

Any ideas?

There are a couple of possible issues here:

Regarding the local file system being used instead of HDFS: YarnConfiguration should load core-site.xml, and your core-site.xml should contain something like:

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://NAMENODE:8020</value>
  </property>

In addition, this core-site.xml must be on the classpath of your application. Note that the Hadoop jars also contain a default/empty core-site.xml, so you have to make sure that yours takes precedence.
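One way to check which copy takes precedence is to ask the class loader directly. The sketch below uses only the JDK, so it can be dropped into the Spring Boot application and run from there. It lists every core-site.xml visible on the classpath, in lookup order, together with the fs.defaultFS value each one declares. The class name HadoopSiteFileCheck and its two helper methods are hypothetical names chosen for this illustration. It also assumes that Hadoop resolves the file by name through the class loader and therefore picks up the first entry listed; treat that as an assumption to verify against your Hadoop version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper, not part of Hadoop or Spring Boot.
class HadoopSiteFileCheck {

    // Returns every copy of the named resource the loader can see, in lookup order.
    static List<URL> locate(ClassLoader loader, String resourceName) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one <property> value from a Hadoop-style XML stream; returns null if it is absent.
    static String readProperty(InputStream xml, String propertyName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
        NodeList properties = doc.getElementsByTagName("property");
        for (int i = 0; i < properties.getLength(); i++) {
            Element property = (Element) properties.item(i);
            NodeList names = property.getElementsByTagName("name");
            NodeList values = property.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && propertyName.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Use the context class loader when available, since that is the one a
        // Spring Boot launcher installs; fall back to this class's own loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = HadoopSiteFileCheck.class.getClassLoader();
        }
        for (URL url : locate(loader, "core-site.xml")) {
            try (InputStream in = url.openStream()) {
                System.out.println(url + " -> fs.defaultFS=" + readProperty(in, "fs.defaultFS"));
            }
        }
    }
}
```

If the first URL printed points inside a Hadoop jar rather than at your own file, or if its fs.defaultFS is missing, that would be consistent with FileSystem.get(conf) falling back to the local file system.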

Regarding YARN and MapReduce: YARN is a generic resource management and scheduling framework, and MapReduce is just one type of application that can run on it. This is why YarnConfiguration does not load the mapred-*.xml files. Those files are loaded by the MapReduce code when you submit a MapReduce job:

Configuration configuration = new YarnConfiguration();
Job job = Job.newInstance(configuration);
job.getConfiguration(); // this configuration should have the mapred-*.xml files loaded
job.submit();
