
hbase rowkey design

Original 2013-03-02 18:42:03 4 2 nosql/ hbase

I am moving from MySQL to HBase because my data keeps growing.

I am designing a rowkey that supports my access patterns efficiently.

I want to achieve 3 goals:

Get all results for an email address
Get all results for an email address + item_type
Get all results for a particular email address + item_id

I have 4 attributes to choose from:

user email
reverse timestamp
item_type
item_id

What should my rowkey look like to get rows efficiently?

Thanks

2 answers

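Assuming your main access is by email, you can use email + reverse time + item_id as your main table key (assuming item_id gives you uniqueness). Because HBase stores rows sorted by key, a scan on the email prefix then returns all rows for that email, newest first.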
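To make the key layout concrete, here is a minimal sketch in plain Python that imitates HBase's byte-wise row ordering with a sorted list. It does not use any HBase client API. The separator byte, the 8-byte reverse timestamp encoding and the names main_key and prefix_scan are assumptions made for illustration, not something from the question.

```python
import struct
from bisect import bisect_left

LONG_MAX = 2**63 - 1   # same value as Java's Long.MAX_VALUE
SEP = b"\x00"          # assumed separator; must never occur inside an email

def reverse_ts(ts_ms):
    # Fixed-width big-endian bytes, so byte order matches numeric order.
    return struct.pack(">q", LONG_MAX - ts_ms)

def main_key(email, ts_ms, item_id):
    # email + reverse time + item_id, as suggested in the answer.
    return email.encode("utf-8") + SEP + reverse_ts(ts_ms) + item_id.encode("utf-8")

def prefix_scan(sorted_keys, prefix):
    # Seek to the first key >= prefix, then read while the prefix still matches.
    out = []
    for key in sorted_keys[bisect_left(sorted_keys, prefix):]:
        if not key.startswith(prefix):
            break
        out.append(key)
    return out

# Stand-in for a table: HBase keeps rows sorted by raw key bytes.
keys = sorted([
    main_key("a@x.com", 1000, "i1"),
    main_key("a@x.com", 2000, "i2"),
    main_key("a@x.com.au", 1500, "i3"),
    main_key("b@x.com", 1200, "i4"),
])

rows = prefix_scan(keys, b"a@x.com" + SEP)
item_ids = [k[-2:].decode() for k in rows]   # item ids are 2 chars in this demo
print(item_ids)
```

The separator is what keeps a@x.com.au out of the scan for a@x.com; without some delimiter or fixed-width encoding, one email that is a prefix of another would leak into the results.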

You can have an additional "index" table keyed by email + item_type + reverse time + item_id and by email + item_id, whose rows map to keys in the first table (so retrieving by these is a two-step process).
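The two-step lookup can be sketched as follows, again with plain Python dictionaries standing in for tables rather than a real HBase client. For readability this sketch keeps the two kinds of index key in two separate dictionaries, which is an assumption; the helper names and the separator byte are hypothetical too.

```python
import struct

LONG_MAX = 2**63 - 1
SEP = b"\x00"          # assumed separator, not part of any email or item_type

def rts(ts_ms):
    return struct.pack(">q", LONG_MAX - ts_ms)

def main_key(email, ts_ms, item_id):
    return email.encode() + SEP + rts(ts_ms) + item_id.encode()

main_table = {}   # main key -> payload
type_index = {}   # email + item_type + reverse time + item_id -> main key
id_index = {}     # email + item_id -> main key (assumes item_id is unique per email)

def put(email, ts_ms, item_type, item_id, payload):
    mk = main_key(email, ts_ms, item_id)
    main_table[mk] = payload
    type_index[email.encode() + SEP + item_type.encode() + SEP
               + rts(ts_ms) + item_id.encode()] = mk
    id_index[email.encode() + SEP + item_id.encode()] = mk

def by_type(email, item_type):
    # Step 1: prefix scan of the index. Step 2: fetch each row from the main table.
    # (A real scan would seek to the prefix instead of filtering every key.)
    prefix = email.encode() + SEP + item_type.encode() + SEP
    hits = [type_index[k] for k in sorted(type_index) if k.startswith(prefix)]
    return [main_table[mk] for mk in hits]

def by_item(email, item_id):
    # Step 1: point lookup in the index. Step 2: point lookup in the main table.
    mk = id_index.get(email.encode() + SEP + item_id.encode())
    return main_table[mk] if mk is not None else None

put("a@x.com", 1000, "book", "i1", "first")
put("a@x.com", 2000, "book", "i2", "second")
put("a@x.com", 3000, "dvd", "i3", "third")
print(by_type("a@x.com", "book"), by_item("a@x.com", "i3"))
```

If both kinds of index key live in one physical table, as the answer describes, they need to stay distinguishable, for example through a short tag at the start of the key.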

Maybe you are already headed in the right direction with concatenated row keys. In any case, the following comes to mind from your post:

The partitioning key likely consists of your reverse timestamp plus the most frequently queried natural key. Would that be the email? Let us suppose so: then choose which of the two (reverse timestamp or email) goes first based on which gives the most balanced, least skewed distribution of your data. That makes your region servers happier.

In other words, pick the ordering with the better-balanced distribution of records: reverse timestamp plus the most frequently queried natural key, e.g. reversetimestamp-email or email-reversetimestamp.

That way you avoid hot spotting on your region servers.
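One rough way to compare the two orderings is to count how many writes land at the very front of the sorted keyspace. The sketch below does that in plain Python for a made-up workload of four emails writing in time order. The workload, the separator byte and the helper names are all assumptions, and a real table's skew depends on its actual data and region splits.

```python
import struct
from bisect import bisect_left, insort

LONG_MAX = 2**63 - 1
SEP = b"\x00"   # assumed separator

def reverse_ts(ts_ms):
    return struct.pack(">q", LONG_MAX - ts_ms)

def ts_first(email, ts_ms):
    # reversetimestamp-email
    return reverse_ts(ts_ms) + email.encode("utf-8")

def email_first(email, ts_ms):
    # email-reversetimestamp
    return email.encode("utf-8") + SEP + reverse_ts(ts_ms)

def head_inserts(make_key, writes):
    # Count the writes whose key sorts before every key already stored,
    # i.e. writes that would all go to the region holding the lowest keys.
    keys, count = [], 0
    for email, ts_ms in writes:
        key = make_key(email, ts_ms)
        if bisect_left(keys, key) == 0:
            count += 1
        insort(keys, key)
    return count

emails = ["alice@x.com", "bob@x.com", "carol@x.com", "dave@x.com"]
writes = [(emails[i % 4], 1000 + i) for i in range(20)]   # timestamps increase

head_ts = head_inserts(ts_first, writes)
head_email = head_inserts(email_first, writes)
print(head_ts, head_email)
```

With timestamps that only increase, every timestamp-leading key sorts ahead of all existing rows, so the writes pile up at one end of the table; with email-leading keys, only the writes for the lowest-sorting email do.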

As for good performance on the additional (secondary) indexes: that is not "baked into" HBase yet. There is a design doc for it (look under SecondaryIndexing in the wiki).

But you can build your own in a couple of ways:

a) Use a coprocessor to write the item_type as the rowkey to a separate table, with a column containing the original fact table rowkey (user_email-reverse timestamp, or vice versa).

b) If disk space is not an issue and/or the rows are small, just go ahead and duplicate the entire row in the second table (and in a third table for the item_id case).
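The difference between options (a) and (b) can be sketched with dictionaries standing in for tables. The post_put function below is a hypothetical stand-in for a coprocessor hook, not the HBase coprocessor API, and the fact rowkeys are opaque placeholder bytes. For (a), this sketch stores each fact rowkey as its own column under the item_type row so that several rows of the same type do not overwrite each other; that detail is an assumption.

```python
SEP = b"\x00"        # assumed separator

fact_table = {}      # fact rowkey -> row (dict of column -> value)
type_pointer = {}    # option (a): item_type -> {fact rowkey: b""}
type_copy = {}       # option (b): item_type + SEP + fact rowkey -> full row copy

def post_put(fact_key, row):
    # Hypothetical hook run after each write to the fact table.
    item_type = row["item_type"].encode()
    type_pointer.setdefault(item_type, {})[fact_key] = b""   # (a) pointer only
    type_copy[item_type + SEP + fact_key] = dict(row)        # (b) whole row

def put(fact_key, row):
    fact_table[fact_key] = row
    post_put(fact_key, row)

def read_a(item_type):
    # Two steps: read the pointers, then fetch each row from the fact table.
    pointers = sorted(type_pointer.get(item_type.encode(), {}))
    return [fact_table[k] for k in pointers]

def read_b(item_type):
    # One step: the duplicated rows are read straight from the second table.
    prefix = item_type.encode() + SEP
    return [type_copy[k] for k in sorted(type_copy) if k.startswith(prefix)]

put(b"a@x.com|1", {"item_type": "book", "item_id": "i1"})
put(b"b@x.com|2", {"item_type": "dvd", "item_id": "i2"})
put(b"c@x.com|3", {"item_type": "book", "item_id": "i3"})
print(read_a("book") == read_b("book"))
```

Option (a) costs an extra lookup per row at read time, while option (b) pays for its single-step read with extra storage and with copies that must be kept in sync on every write.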

Best rowkey design for hbase

HBase RowKey for Hierarchical data

HBase - rowkey basics

Hbase performance rowkey vs column qualifiers

HBase Design Row Key

HBase Row Key Design

HBase: Regarding schema design

HBase schema design example

Hbase schema design suggestion

Hbase Schema design


The technical posts on this site are published under the CC BY-SA 4.0 license. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1
1 2013-03-04 04:59:42

solution2
0 2013-03-02 23:42:17
