
HBase row key range assignment

While designing a row key for my HBase table, I ran into two questions:

  1. How are row key ranges assigned across HBase regions?
  2. Do row insertions affect the row key assignment?

(Assume we have only two regions.)

To elaborate on the questions:

  1. If I insert row keys starting with axx, bxx, ..., zxx, does the HBase Master assign the range a-m to one region and n-z to the other?

  2. If instead I insert row keys starting only with axx and bxx, does it assign axx to one region and bxx to the other?

HBase does not split a region until it fills up. A new table starts with a single region, so even on a cluster with two region servers all data initially goes into that one region. When the region fills up, it is split into two at roughly the middle key of the full region.

For your question 1, all keys would be added to one region initially. Assuming an even spread of keys, you should expect to see something close to a-m in one region and n-z in the other after the first split occurs.

To show this graphically, assume our two regions can only store four rows each. After entering four records, you'd see:

REGION 1   REGION 2
+-----+    +-----+
| axx |    |     |
| bxx |    |     |
| cxx |    |     |
| dxx |    |     |
+-----+    +-----+

Now if we want to add axy, it won't fit in REGION 1, so the region is split down the middle:

REGION 1   REGION 2
+-----+    +-----+
| axx |    | cxx |
| bxx |    | dxx |
|     |    |     |
|     |    |     |
+-----+    +-----+

and finally our new record is added:

REGION 1   REGION 2
+-----+    +-----+
| axx |    | cxx |
| axy |    | dxx |
| bxx |    |     |
|     |    |     |
+-----+    +-----+
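The walkthrough above can be sketched as a small simulation. This is a toy model under the same assumption as the diagrams (a region "fills up" after four rows); real HBase decides when to split based on the size of the stored data, not a row count. The names ToyTable, capacity and put are hypothetical and are not part of any HBase API.

```python
import bisect


class ToyTable:
    """Toy model of range splitting; not the real HBase algorithm."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # assumed max rows per region (illustrative)
        self.regions = [[]]       # each region is a sorted list of row keys

    def _region_index(self, key):
        # Every region after the first starts at its smallest key.
        starts = [region[0] for region in self.regions[1:]]
        return bisect.bisect_right(starts, key)

    def put(self, key):
        i = self._region_index(key)
        region = self.regions[i]
        if len(region) >= self.capacity:
            # Region is full: split it at the middle key, then re-route the key.
            mid = len(region) // 2
            self.regions[i:i + 1] = [region[:mid], region[mid:]]
            i = self._region_index(key)
            region = self.regions[i]
        bisect.insort(region, key)


table = ToyTable(capacity=4)
for row_key in ["axx", "bxx", "cxx", "dxx", "axy"]:
    table.put(row_key)

print(table.regions)  # [['axx', 'axy', 'bxx'], ['cxx', 'dxx']]
```

The same model suggests an answer to question 2: if only keys starting with axx and bxx are inserted, the split point is still whatever key sits in the middle of the full region, so the boundary falls between axx and bxx only if the two prefixes are roughly equally common.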

PRE-SPLITTING

If you know your likely key distribution in advance and wish to avoid expensive automatic splits, you can pre-split when you create the table:

create 'animals', 'a', {SPLITS => ['e','m','r']}

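Here 'a' is the column family name. This would create four regions covering the key ranges start-e, e-m, m-r and r-end, where "start" and "end" are the beginning and end of the key space.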
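To see which of those four regions a given row key would land in, the lookup can be sketched as a binary search over the split points. This is only an illustration: region_for is a hypothetical helper, not an HBase call, and it assumes plain ASCII keys so that Python's string ordering matches HBase's byte-wise ordering of row keys. A region's start key is inclusive and its end key is exclusive.

```python
import bisect

# Split points from the create command above.
SPLITS = ["e", "m", "r"]


def region_for(row_key, splits=SPLITS):
    """Return the 0-based index of the region whose key range holds row_key."""
    # Region 0 covers keys below splits[0]; region i covers
    # [splits[i-1], splits[i]); the last region is open-ended.
    return bisect.bisect_right(splits, row_key)


for key in ["axx", "lion", "m", "zebra"]:
    print(key, "-> region", region_for(key))
# axx -> region 0
# lion -> region 1
# m -> region 2
# zebra -> region 3
```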
