How does HBase mapreduce job communicate with server? (newbie question)

Original post 2011-06-27 15:43:50 9 3 java/ hadoop/ hbase

I am new to Hadoop and HBase, and even though I've read a lot, I still don't understand the basic hierarchy and workflow of the MapReduce job API.

From what I understand, I will need to use the Java API to implement certain classes and pass them to HBase, which will coordinate the splitting and distribution process. Is that correct?

If so, how does the application communicate with the server to pass the relevant code for the MapReduce job? I am missing a link here.

Thanks

3 answers

When you run your HBase MapReduce job, your classpath has to contain both the HBase and MapReduce configuration files. The configuration files will contain settings such as the location of the JobTracker, the HDFS NameNode, and the HBase master node. The runtime will then automatically pick up all these settings from the configuration files so that your job knows which servers to contact.
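To make the idea of "settings picked up from configuration files" more concrete, here is a minimal sketch of what such a file contains and how name/value pairs can be read from it. It is only a model written with the JDK's XML parser, not Hadoop's own `Configuration` class. The class name `SiteConfigSketch`, the host names, and the ports are hypothetical. The property names are the Hadoop 1.x-era keys as far as I know, and in a real installation they usually live in separate files (`core-site.xml`, `mapred-site.xml`, `hbase-site.xml`) rather than in one document. Note that HBase clients typically locate the master through the ZooKeeper quorum.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative model only: this is NOT Hadoop's Configuration class.
class SiteConfigSketch {

    // Hypothetical hosts and ports, merged into one document for brevity.
    static final String SAMPLE =
          "<configuration>"
        + "<property><name>fs.default.name</name>"
        + "<value>hdfs://namenode.example.com:8020</value></property>"
        + "<property><name>mapred.job.tracker</name>"
        + "<value>jobtracker.example.com:8021</value></property>"
        + "<property><name>hbase.zookeeper.quorum</name>"
        + "<value>zk1.example.com</value></property>"
        + "</configuration>";

    // Parse a Hadoop-style <configuration> document into name/value pairs.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> settings = new LinkedHashMap<>();
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                settings.put(name, value);
            }
            return settings;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Print each setting the job client would use to find the servers.
        parse(SAMPLE).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In an actual job you would not parse these files yourself; the framework reads them from the classpath when the job's configuration object is created.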

I think you should just work through the basic tutorial, which should make things clear. I found the quickest way to get started was by playing with the Cloudera VM.

Also, I'm not sure about your reference to HBase; you should be passing Java classes to Hadoop, not HBase.

However, in an attempt to answer your question: Hadoop should be installed on all nodes in your cluster. The Hadoop framework will take care of farming the map and reduce tasks out to the nodes.

The standard way to execute an M/R job that uses HBase is the same way you execute a non-HBase M/R job: ${HADOOP_HOME}/bin/hadoop jar .jar [args]

This copies your jar to all of the task trackers (via HDFS) so that they can execute your code.

With HBase you will also typically use the HBase utility class TableMapReduceUtil: initTableMapperJob when the job reads from a table, and initTableReducerJob when it writes to one.

When reading from a table, the built-in input format splits the HBase table along the regions of the table so that the computation can be distributed over the map tasks. If you want a different split, you have to modify the way splits are calculated, which means that you cannot use the built-in utility as it is.
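The region-based splitting described above can be pictured with a small model. This is a sketch under stated assumptions, not HBase code: the names `RegionSplitSketch`, `Split`, and `splitsFor` are made up for illustration, and real HBase obtains region boundaries from its own metadata. The sketch only shows the default idea of one split per region, where an empty key marks the open start of the first region and the open end of the last one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: none of these names are HBase APIs.
class RegionSplitSketch {

    // One unit of map work covering rows in [startKey, endKey).
    // An empty string means "unbounded" on that side.
    static final class Split {
        final String startKey;
        final String endKey;

        Split(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        @Override
        public String toString() {
            return "[" + startKey + ", " + endKey + ")";
        }
    }

    // Build one split per region. regionStartKeys must be sorted,
    // and the first region is expected to start at the empty key.
    static List<Split> splitsFor(List<String> regionStartKeys) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String start = regionStartKeys.get(i);
            // A region ends where the next one begins; the last region is open-ended.
            String end = (i + 1 < regionStartKeys.size()) ? regionStartKeys.get(i + 1) : "";
            splits.add(new Split(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Hypothetical table with three regions starting at "", "g" and "p".
        for (Split s : splitsFor(Arrays.asList("", "g", "p"))) {
            System.out.println(s);
        }
    }
}
```

Each split would become one map task, which is why a table with few regions yields few mappers unless the split calculation is customized.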

The other thing you can specify is conditions on the rows that are returned. If you use a built-in scan condition, then you don't have to do anything special. However, if you want to create a custom comparator, you have to make sure that the region servers have this code in their classpath so that they can execute it. Before you go this route, examine the built-in comparators carefully, as they are quite powerful.


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1
3 2011-12-23 01:57:33

solution2
0 ACCEPTED 2011-06-27 15:52:24

solution3
0 2011-06-28 17:20:02
