Giraph Job always running in local mode
I ran Giraph 1.1.0 on Hadoop 2.6.0. The mapred-site.xml looks like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs. Can be one of
local, classic or yarn.</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>8192</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx3072m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx6144m</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>4</value>
</property>
</configuration>
The giraph-site.xml looks like this:
<configuration>
<property>
<name>giraph.SplitMasterWorker</name>
<value>true</value>
</property>
<property>
<name>giraph.logLevel</name>
<value>error</value>
</property>
</configuration>
I do not want to run the job in local mode. I have also set the environment variable MAPRED_HOME to HADOOP_HOME. This is the command to run the program:
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 1
When I run this code, which computes the betweenness centrality of the vertices in a graph, I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:168)
at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:236)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.runMain(BetweennessComputation.java:214)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.main(BetweennessComputation.java:218)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
What should I do to ensure that the job does not run in local mode?
I ran into this problem just a few days ago and fortunately solved it as follows.
Modify the configuration file mapred-site.xml: make sure the value of the property 'mapreduce.framework.name' is 'yarn', and add the property 'mapreduce.jobtracker.address' with the value 'yarn' if it is not there.
The mapred-site.xml looks like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>yarn</value>
</property>
</configuration>
Restart Hadoop after modifying mapred-site.xml. Then run your program with the value after '-w' set to more than 1 and the value of 'giraph.SplitMasterWorker' set to 'true'. It will probably work.
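Assuming a standard Hadoop 2.x installation where the sbin scripts are on the PATH, the restart and re-run described above could look like this (the script names and the worker count of 4 are illustrative):

```shell
# Restart HDFS and YARN so the modified mapred-site.xml is picked up
stop-yarn.sh
stop-dfs.sh
start-dfs.sh
start-yarn.sh

# Re-run the job with more than one worker (last argument > 1)
hadoop jar myjar.jar \
  hu.elte.inf.mbalassi.msc.giraph.betweenness.BettweennessComputation \
  /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 4
```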
As for the cause of the problem, I will just quote what somebody else said: these properties are designed for single-node execution and have to be changed when executing things on a cluster of nodes. In such a situation, the jobtracker has to point to one of the machines that will be running a NodeManager daemon (a Hadoop slave). As for the framework, it should be changed to 'yarn'.
We can see in the stack trace that the configuration check in LocalJobRunner fails. This is a bit misleading, because it makes us assume that we are running in local mode. You already found the responsible configuration option, giraph.SplitMasterWorker, but in your case you set it to true. However, on the command line, with the last parameter 1 you specify that only a single worker should be used. Hence the framework decides that you MUST be running in local mode. As a solution you have two options:
Set giraph.SplitMasterWorker to false, although you are running on a cluster.
Increase the number of workers by changing the last parameter of the command-line call.
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 4
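To see why a single worker triggers the exception, here is a simplified sketch of the condition that checkLocalJobRunnerConfiguration enforces, as described in this answer. This is illustrative code, not Giraph's actual source: with LocalJobRunner only one map task runs at a time, so master and worker cannot be split into separate tasks.

```java
// Illustrative sketch (NOT the real GiraphJob source) of the local-mode check.
public class LocalModeCheck {
    public static void checkLocalJobRunnerConfiguration(
            boolean splitMasterWorker, int workers) {
        boolean localMode = (workers <= 1); // only one task can run at a time
        if (localMode && splitMasterWorker) {
            throw new IllegalArgumentException(
                "checkLocalJobRunnerConfiguration: When using LocalJobRunner, "
                + "you cannot run in split master / worker mode since there "
                + "is only 1 task at a time!");
        }
    }

    public static void main(String[] args) {
        checkLocalJobRunnerConfiguration(false, 1); // ok: no split requested
        checkLocalJobRunnerConfiguration(true, 4);  // ok: enough workers
        checkLocalJobRunnerConfiguration(true, 1);  // throws IllegalArgumentException
    }
}
```

Either disabling the split or raising the worker count makes the condition pass, which is exactly the pair of options above.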
Please also refer to my other answer on SO (Apache Giraph master / worker mode) for details on the problem concerning local mode.
If you want to split the master from the workers you can use:
-ca giraph.SplitMasterWorker=true
and to specify the number of workers you can use:
-w #
where "#" is the number of workers you want to use.
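Putting both flags together, a GiraphRunner-style invocation might look like the sketch below. The jar, computation class, input/output formats, and paths are taken from the standard Giraph examples and are placeholders; substitute your own job's names.

```shell
# Split master from workers (-ca) and request 4 workers (-w).
# Jar, class, and path names are illustrative examples only.
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/$USER/input/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/$USER/output/shortestpaths \
  -ca giraph.SplitMasterWorker=true \
  -w 4
```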
Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to repost, please credit this site or link to the original article. For any questions, contact yoyou2525@163.com.