
HBase row key design for reads and updates

Original post: 2014-10-11 21:51:34. Tags: hadoop, hbase, bloom-filter

I'm trying to understand the best way to design the row key for my HBase table.

My use case:

Structure right now

PersonID | BatchDate | PersonJSON

When something about a person is modified, a new PersonJSON and a new BatchDate are written to HBase, overwriting the old record. Every 4 hours, a scan picks up all the people who were modified, and those records are pushed to Hadoop for further processing.

If my key is just PersonID, it is great for updating the data. But scan performance is poor, because I have to add a filter on the BatchDate column to find all the rows newer than a given batch date, which means reading the whole table.

If my key is a composite key like BatchDate|PersonID, I could use a start row and end row on the row key and get all the rows that have been modified. But then I would have a lot of duplicates, since the same person ends up under a different key for every batch date, and I could no longer update a person in place.
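To make the trade-off concrete, here is a minimal sketch of the composite-key idea in plain Python, without an HBase client. The helper names `batch_key` and `scan_bounds`, the `|` separator, and the fixed-width UTC timestamp format are all assumptions for illustration. The point is that a fixed-width timestamp prefix makes byte order match time order, so a start row and stop row select a time window, while the same person shows up once per batch date.

```python
from datetime import datetime, timezone

FMT = "%Y%m%d%H%M%S"  # fixed width, so lexicographic order equals time order

def batch_key(batch_date, person_id):
    """Build a hypothetical 'BatchDate|PersonID' row key as bytes."""
    stamp = batch_date.astimezone(timezone.utc).strftime(FMT)
    return f"{stamp}|{person_id}".encode("utf-8")

def scan_bounds(start, end):
    """Return (start row inclusive, stop row exclusive) covering [start, end)."""
    return (start.astimezone(timezone.utc).strftime(FMT).encode("utf-8"),
            end.astimezone(timezone.utc).strftime(FMT).encode("utf-8"))

def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)

# Person p1 is modified twice, p2 once: three rows, two of them for p1.
rows = sorted([
    batch_key(utc(2014, 10, 11, 8, 0, 0), "p1"),
    batch_key(utc(2014, 10, 11, 9, 0, 0), "p2"),
    batch_key(utc(2014, 10, 11, 10, 0, 0), "p1"),
])

# Emulate a range scan over the sorted keys for one 4-hour window.
lo, hi = scan_bounds(utc(2014, 10, 11, 8, 0, 0), utc(2014, 10, 11, 12, 0, 0))
window = [r for r in rows if lo <= r < hi]
```

Here `window` contains all three keys, including both versions of `p1`, which is the duplication described above.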

Is a Bloom filter on row+col (PersonID+BatchDate) an option?

Any help is appreciated. Thanks, Abhishek

2 answers

In addition to the table with PersonID as the row key, it sounds like you need a dual-write secondary index, with BatchDate as the row key.

Another option would be Apache Phoenix, which provides support for secondary indexes.

I usually do this in two steps. Create table one, whose key is the combination of BatchDate+PersonID; the value can be empty. Create table two as you normally would: the key is PersonID and the value is the whole record.

For a date range query, query table one first to get the PersonIDs, and then use the HBase batch get API to fetch the data in batches. It would be very fast.
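The two-table flow above can be sketched with in-memory stand-ins, assuming fixed-width timestamp strings for BatchDate. Everything here is hypothetical scaffolding: `index_table` plays the role of table one, `data_table` the role of table two, and `put_person` and `changed_between` are made-up helper names. With a real HBase client, the range lookup would be a scan with a start and stop row, and the second step would be a multi-get such as `Table.get(List<Get>)`.

```python
import bisect

index_table = []  # table one: sorted "BatchDate|PersonID" keys, empty values
data_table = {}   # table two: PersonID -> (BatchDate, PersonJSON)

def put_person(person_id, batch_date, person_json):
    """Dual write: overwrite the data row, then add an index row."""
    data_table[person_id] = (batch_date, person_json)
    key = f"{batch_date}|{person_id}"
    pos = bisect.bisect_left(index_table, key)
    if pos == len(index_table) or index_table[pos] != key:
        index_table.insert(pos, key)

def changed_between(start, stop):
    """Range-scan the index over [start, stop), then batch-get the data rows."""
    lo = bisect.bisect_left(index_table, start)
    hi = bisect.bisect_left(index_table, stop)
    # A person modified several times appears in several index rows,
    # so collapse to distinct PersonIDs before fetching.
    person_ids = {key.split("|", 1)[1] for key in index_table[lo:hi]}
    return {pid: data_table[pid][1] for pid in sorted(person_ids)}

put_person("p1", "20141011080000", '{"v": 1}')
put_person("p2", "20141011090000", '{"v": 1}')
put_person("p1", "20141011100000", '{"v": 2}')

changed = changed_between("20141011080000", "20141011120000")
```

Note that older index rows are never removed in this sketch, so the index grows with every update; the data table always holds only the latest record per person, which keeps updates by PersonID simple.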


