How to change the address the 'hadoop jar' command connects to?

Question

I have been trying to start a MapReduce job on my cluster with the following command:

bin/hadoop jar myjar.jar MainClass /user/hduser/input /user/hduser/output

But I get the following error over and over again, until the connection is refused:

13/08/08 00:37:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

I then checked with netstat to see if the service was listening on the correct port:

~> sudo netstat -plten | grep java
tcp        0      0 10.1.1.4:54310          0.0.0.0:*               LISTEN      10022      38365       11366/java
tcp        0      0 10.1.1.4:54311          0.0.0.0:*               LISTEN      10022      32164       11829/java

Now I notice that my service is listening on 10.1.1.4:54310, which is the IP of my master, but the 'hadoop jar' command seems to connect to 127.0.0.1 (localhost, which is the same machine) and therefore doesn't find the service. Is there any way to force 'hadoop jar' to look at 10.1.1.4 instead of 127.0.0.1?

My NameNode, DataNode, JobTracker, TaskTracker, ... are all running. I even checked for DataNode and TaskTracker on the slaves and it all seems to be working. I can check the WebUI on the master and it shows my cluster is online.

I suspect the problem is DNS related, since the 'hadoop jar' command finds the correct port but always uses the 127.0.0.1 address instead of 10.1.1.4.
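
One way to test the DNS suspicion is to ask the JVM what a hostname resolves to on the machine that submits the job. The sketch below is only an illustration: the class name ResolveCheck and the default hostname "master" are assumptions, and it uses nothing beyond java.net.InetAddress. If "master" comes back as a 127.x address, the client would end up on the loopback interface, where nothing is listening according to the netstat output.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: lists the IP addresses a hostname resolves to.
public class ResolveCheck {

    // Returns every address the local resolver reports for the given host.
    // An IP literal such as "10.1.1.4" is returned as-is without a lookup.
    public static List<String> resolve(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }

    public static void main(String[] args) throws UnknownHostException {
        // "master" is the hostname used in core-site.xml; pass another name as the first argument.
        String host = args.length > 0 ? args[0] : "master";
        System.out.println(host + " -> " + resolve(host));
    }
}
```

Running it once with master and once with localhost shows whether the two names point at different interfaces.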

UPDATE

Configuration in core-site.xml

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

</configuration>

Configuration in mapred-site.xml

<configuration>

<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

</configuration>

Configuration in hdfs-site.xml

<configuration>

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

</configuration>

Answer 1

Although it looked like a DNS issue, the cause was a hard-coded reference to localhost in the job's code, which Hadoop was resolving. I was deploying someone else's jar and assumed it was correct. On closer inspection I found the reference to localhost and changed it to master, which solved my issue.
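
The offending code is not shown here, so the following is only a sketch of how such a reference could be spotted, assuming it took the form of a filesystem URI like hdfs://localhost:54310. The class FsTargetCheck and its method are hypothetical names; the check relies only on java.net.URI and java.net.InetAddress and reports whether the host in a URI resolves to a loopback address.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: flags filesystem URIs that point at the loopback interface.
public class FsTargetCheck {

    // Returns true when the host part of the URI resolves to a loopback address (127.x.x.x or ::1).
    public static boolean isLoopbackTarget(String fsUri) throws UnknownHostException {
        String host = URI.create(fsUri).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in URI: " + fsUri);
        }
        return InetAddress.getByName(host).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Assumed example value; pass the URI found in the job code as the first argument.
        String uri = args.length > 0 ? args[0] : "hdfs://localhost:54310";
        System.out.println(uri + " -> loopback: " + isLoopbackTarget(uri));
    }
}
```

A value that reports loopback on the submitting machine would explain the retries against 127.0.0.1:54310, since the NameNode is bound to 10.1.1.4 only.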

solution1
0 ACCEPTED 2013-08-09 08:14:55

How to change address 'hadoop jar' command is connecting to?

Question

1 answers

solution1 0 ACCPTED 2013-08-09 08:14:55

solution1
0 ACCPTED 2013-08-09 08:14:55