Calling a Giraph job from a simple Java program

I am new to Giraph and Hadoop YARN. Following Giraph's quick start, I was able to run the example job from the command line, using a jar built from source.

I want to run the job from a simple Java program. This question is inspired by a previous, similar question about MapReduce jobs. I am looking for a similar answer, including the Java dependencies that would be needed.

I have YARN set up locally and need a way to submit the job to it from a Java program.

The GiraphJob API docs (https://giraph.apache.org/apidocs/org/apache/giraph/job/GiraphJob.html) suggest there must be a way to do this, but I am finding it hard to find examples of it with YARN.

I found a way to do this based on GiraphRunner's source code:

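import java.io.IOException;

import org.apache.giraph.conf.GiraphConfiguration;
import org.apache.giraph.conf.GiraphConstants;
import org.apache.giraph.io.formats.GiraphFileInputFormat;
import org.apache.giraph.io.formats.IdWithValueTextOutputFormat;
import org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat;
import org.apache.giraph.job.GiraphJob;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.junit.Test;
// PageRankComputation must also be imported; its package is not shown here.

@Test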
public void testPageRank() throws IOException, ClassNotFoundException, InterruptedException {

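    GiraphConfiguration giraphConf = new GiraphConfiguration(getConf());
    // min workers = 1, max workers = 1, wait for 100% of workers to respond
    giraphConf.setWorkerConfiguration(1, 1, 100);
    // do not use a separate task for the master; needed with a single worker
    GiraphConstants.SPLIT_MASTER_WORKER.set(giraphConf, false);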

    giraphConf.setVertexInputFormatClass(JsonLongDoubleFloatDoubleVertexInputFormat.class);
    GiraphFileInputFormat.setVertexInputPath(giraphConf,
                                             new Path("/input/tiny-graph.txt"));
    giraphConf.setVertexOutputFormatClass(IdWithValueTextOutputFormat.class);

    giraphConf.setComputationClass(PageRankComputation.class);

    GiraphJob giraphJob = new GiraphJob(giraphConf, "page-rank");       

    FileOutputFormat.setOutputPath(giraphJob.getInternalJob(),
                                   new Path("/output/page-rank2"));
    giraphJob.run(true);
}

private Configuration getConf() {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    conf.set("yarn.resourcemanager.address", "localhost:8032");
    conf.set("yarn.resourcemanager.hostname", "localhost");

    // the framework is now "yarn"; this is normally defined in mapred-site.xml
    conf.set("mapreduce.framework.name", "yarn");
    return conf;
}
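
The test above reads its graph from /input/tiny-graph.txt through JsonLongDoubleFloatDoubleVertexInputFormat. As a rough sketch of what that file might contain, the helper below builds lines in the layout [vertexId,vertexValue,[[targetId,edgeWeight],...]], which is my assumption based on the tiny graph in Giraph's quick start. The class TinyGraphLines and its method are hypothetical names for illustration, not part of Giraph, and the graph data is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Giraph. Builds one input line per vertex
// in the assumed layout [vertexId,vertexValue,[[targetId,edgeWeight],...]].
class TinyGraphLines {

    static String vertexLine(long id, double value, long[] targets, float[] weights) {
        if (targets.length != weights.length) {
            throw new IllegalArgumentException("targets and weights must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('[').append(id).append(',').append(value).append(",[");
        for (int i = 0; i < targets.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            // each out-edge is written as [targetId,edgeWeight]
            sb.append('[').append(targets[i]).append(',').append(weights[i]).append(']');
        }
        return sb.append("]]").toString();
    }

    public static void main(String[] args) {
        // A made-up three-vertex graph, one line per vertex.
        List<String> lines = new ArrayList<>();
        lines.add(vertexLine(0, 0.0, new long[] {1, 2}, new float[] {1f, 3f}));
        lines.add(vertexLine(1, 0.0, new long[] {0, 2}, new float[] {1f, 2f}));
        lines.add(vertexLine(2, 0.0, new long[] {1}, new float[] {2f}));
        lines.forEach(System.out::println);
    }
}
```

The printed lines would still need to be saved to a file and copied to HDFS at the path passed to setVertexInputPath before the job runs.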
