How to control the number of mappers per region server for reading an HBase table
I have an HBase table (written through Apache Phoenix) that needs to be read and written out to a flat text file. The current bottleneck is that the table has 32 salt buckets, so the job opens only 32 mappers to read it. Once the data grows past 100 billion rows this becomes very time-consuming. Can someone point me to how to control the number of mappers per region server when reading an HBase table? I have also seen the program at https://gist.github.com/bbeaudreault/9788499 , but it does not have a driver program that explains it fully. Can someone help?
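For context on why the parallelism is capped at 32: Phoenix salting prefixes every row key with one byte derived from a hash of the key modulo `SALT_BUCKETS`, so the table is pre-split into exactly that many regions, and a table scan gets one input split per region. A minimal sketch of the idea (this uses a simple rolling hash purely for illustration, not Phoenix's actual internal hash function):

```java
// Sketch of Phoenix-style salting: each row key gets a one-byte prefix
// computed from a hash of the key modulo the bucket count, so rows are
// spread across SALT_BUCKETS regions. Illustration only, not Phoenix's
// real hash.
public class SaltSketch {
    static final int SALT_BUCKETS = 32; // as in the question's table

    // Compute the salt bucket for a row key.
    static int saltBucket(byte[] rowKey) {
        int hash = 0;
        for (byte b : rowKey) {
            hash = 31 * hash + b; // simple rolling hash, illustration only
        }
        return Math.abs(hash % SALT_BUCKETS);
    }

    public static void main(String[] args) {
        // Every key lands in one of 32 buckets, so a MapReduce scan of
        // the table sees at most 32 splits (one per bucket/region).
        int salt = saltBucket("row-000123".getBytes());
        System.out.println("salt bucket = " + salt);
    }
}
```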
In my observation, the number of regions in the table = the number of mappers opened by the framework.
So reducing the number of regions will in turn reduce the number of mappers.
1) Pre-split the HBase table while creating it, for example on prefixes 0-9.
2) Load all the data into these regions by generating row keys with a prefix between 0 and 9.
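The two steps above can be sketched as follows. Building the split keys is plain Java; the `createTable` call is shown only as a comment because it needs a live cluster, and the `admin`/`tableDescriptor` names there are hypothetical:

```java
// Sketch: build split keys so an HBase table created with them has ten
// regions covering row-key prefixes '0' through '9'.
public class PreSplit {
    // Nine split points ('1'..'9') yield ten regions:
    // [start,'1'), ['1','2'), ... ['9',end).
    static byte[][] splitKeys() {
        byte[][] splits = new byte[9][];
        for (int i = 0; i < 9; i++) {
            splits[i] = new byte[] { (byte) ('1' + i) };
        }
        return splits;
    }

    // Step 2: prefix each row key with a digit 0-9 so writes spread
    // evenly across the ten regions (simple modulo for illustration).
    static String prefixedKey(String key) {
        int prefix = Math.abs(key.hashCode() % 10);
        return prefix + key;
    }

    public static void main(String[] args) {
        // With a live cluster you would pass the splits to createTable,
        // e.g. (hypothetical names):
        // admin.createTable(tableDescriptor, splitKeys());
        for (byte[] k : splitKeys()) {
            System.out.println(new String(k));
        }
        System.out.println(prefixedKey("order-42"));
    }
}
```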
Also, have a look at apache-hbase-region-splitting-and-merging.
Moreover, setting the number of mappers does not guarantee that many will actually be opened; the count is driven by the input splits.
You can change the number of mappers using setNumMapTasks or conf.set('mapred.map.tasks','numberofmappersyouwanttoset') (but this is only a suggestion to the configuration).
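To make the point above concrete: with HBase's `TableInputFormat` the framework creates one input split per region, so the effective mapper count equals the region count and the `mapred.map.tasks` hint is ignored. A toy illustration of that behaviour (the helper below is hypothetical, not a Hadoop API):

```java
// Toy illustration: for an HBase TableInputFormat job the number of
// mappers equals the number of input splits, i.e. one per region.
// The mapred.map.tasks setting is only a hint and cannot raise it.
public class MapperCount {
    // Hypothetical helper, NOT a Hadoop API: models the mapper count
    // the framework actually uses for a region-based table scan.
    static int effectiveMappers(int regions, int requestedHint) {
        return regions; // the hint is ignored for region-based splits
    }

    public static void main(String[] args) {
        // A 32-salt-bucket table has 32 regions, so even if the job
        // requests 200 mappers it still gets 32.
        System.out.println(MapperCount.effectiveMappers(32, 200));
    }
}
```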
About the link you provided: I don't know how it works; you could check with the author.