
MapReduce on HBase

I am running a MapReduce job that processes 30 rows from an HBase table (MAP_INPUT_RECORDS=30). The table has 11000 regions, but under our region split policy a given record always lives in exactly one region (that is, a single record is never spread across two or more regions). The log shows 65 mappers launched (TOTAL_LAUNCHED_MAPS=65). According to the HBase documentation, one mapper is assigned per region, but in my case the number of mappers is more than the number of regions. Please suggest a solution. Thanks in advance.

You have 11000 table regions, so you can have at most 11000 mappers.

Are you confusing table regions with HBase region servers? An HBase cluster can have 10 region servers while a table hosted on it has 1000 regions, with each region server hosting 100 regions.

TableInputFormat creates mappers based on the regions of the table, not on the HBase region servers.
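To make the region-versus-region-server distinction concrete, here is a minimal Python sketch of how the split count can be reasoned about. It is a simplified model, not HBase code: `count_splits`, the zero-padded row keys, and the scan range are all hypothetical and chosen only for illustration. The model assumes one split per region whose key range overlaps the scan's start/stop rows; the number of region servers never enters the calculation.

```python
def count_splits(region_start_keys, scan_start=b"", scan_stop=b""):
    """Count regions whose key range overlaps the scan range.

    region_start_keys: sorted region start keys, the first being b"".
    Region i covers [start_i, start_{i+1}); the last region is open-ended.
    An empty scan_start means "from the first row"; an empty scan_stop
    means "to the last row". This is a simplified model (assumption),
    not the actual TableInputFormat implementation.
    """
    count = 0
    n = len(region_start_keys)
    for i, start in enumerate(region_start_keys):
        # b"" as an end key marks the open-ended last region.
        end = region_start_keys[i + 1] if i + 1 < n else b""
        scan_starts_before_region_ends = end == b"" or scan_start < end
        scan_stops_after_region_starts = scan_stop == b"" or scan_stop > start
        if scan_starts_before_region_ends and scan_stops_after_region_starts:
            count += 1
    return count


# Hypothetical table: 1000 regions split on zero-padded keys 0001..0999.
regions = [b""] + [f"{i:04d}".encode() for i in range(1, 1000)]

# A full-table scan touches every region, however many servers host them.
full_scan_splits = count_splits(regions)

# A scan restricted to a key range touches only the overlapping regions.
ranged_scan_splits = count_splits(regions, b"0100", b"0165")

print(full_scan_splits, ranged_scan_splits)  # 1000 65
```

In this model a full scan yields 1000 splits and the restricted scan yields 65, regardless of whether the regions sit on 10 region servers or 100. The key range was picked only to give a small number; it is not a diagnosis of the job in the question.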

For a better understanding, see http://bytepadding.com/big-data/hbase/hbase-parameter-tuning/
