
HBase sequential row key (YYYYMMDDhhmmss), deterministic non-random salt

The leading part of my row key looks like "YYYYMMDDhhmmss", where 'ss' is always 00. Example: 20170603162100, which corresponds to 16:21 on 3rd June 2017. (Don't ask me why, but the timestamp has to be at the start of the key!)

The data arrives once per minute, so every minute has its own unique key.

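This suffers from region hot-spotting. Because the keys are sequential, consecutive row keys (for example 20170603162100, 20170603162200, 20170603162300) all land on the same region server.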

My read pattern: get the data for one specific minute (not for an hour, a day, a month, or a year).

Say I have 10 region servers.

Here is a solution I am thinking of, which looks like a kind of salt (but deterministic, not random):

I look at the mm (minute) part and assign a salt based on it:

00 minute: prefix A to row key
01 minute: prefix B to row key
..
09 minute: prefix J to row key
10 minute: prefix A to row key (the cycle repeats)

This way all 'A' keys should go to the first region server, and so forth. The advantages may be: all requests for a single minute go to the same region server, which is bearable for me, and the very next minute all requests go to a different region server.

Also, when retrieving, I won't have to do parallel reads, because I actually know the salt.
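To make the scheme concrete, here is a minimal sketch of the minute-based salt in plain Python, with no HBase client involved. The names `NUM_BUCKETS`, `salt_for` and `salted_key` are made up for illustration. It assumes one bucket per region server, and that the table is pre-split on the prefix letters, since otherwise the prefixes would not map to separate region servers.

```python
from collections import Counter

NUM_BUCKETS = 10  # assumption: one salt bucket per region server


def salt_for(ts_key: str) -> str:
    """Return the salt letter for a YYYYMMDDhhmmss key, derived from mm."""
    minute = int(ts_key[10:12])  # characters 10-11 hold the minute
    return chr(ord("A") + minute % NUM_BUCKETS)


def salted_key(ts_key: str) -> str:
    """Build the row key; writer and reader compute it the same way."""
    return salt_for(ts_key) + ts_key


# Salts for every minute of one hour (16:00 to 16:59 on 2017-06-03).
hour_keys = [f"2017060316{m:02d}00" for m in range(60)]
spread = Counter(salt_for(k) for k in hour_keys)
```

Because the salt is a pure function of the timestamp, a reader that knows the minute can rebuild the full key and issue a single lookup. Over one hour each of the ten prefixes receives six keys, so the load rotates evenly from minute to minute.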

Can someone explain whether I am going wrong somewhere?

Well, the English alphabet covers just 26 minutes, so I would probably suggest using a two-letter salt; it should still distribute properly. (How many nodes do you have?)
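One possible reading of the two-letter suggestion, sketched in Python: give each of the 60 minute values its own two-letter prefix. The function name `two_letter_salt` and the base-26 encoding are assumptions made for illustration; the answer does not specify how the letters would be chosen.

```python
import string

LETTERS = string.ascii_uppercase  # the 26 letters A-Z


def two_letter_salt(ts_key: str) -> str:
    """Encode the minute (0-59) of a YYYYMMDDhhmmss key as two letters."""
    minute = int(ts_key[10:12])
    high, low = divmod(minute, len(LETTERS))  # base-26 digits of the minute
    return LETTERS[high] + LETTERS[low]


# All 60 minutes of an hour map to distinct prefixes.
salts = {two_letter_salt(f"2017060316{m:02d}00") for m in range(60)}
```

With this encoding, minute 00 maps to AA and minute 59 to CH. How well the prefixes spread across region servers still depends on where the table's regions are split.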

Alternatively, you can try simply removing the seconds from your row key and reversing it.
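A short sketch of that alternative, assuming the key is handled as a plain string and that the seconds are always "00" as stated in the question. The helper names `reversed_key` and `restore_key` are hypothetical.

```python
def reversed_key(ts_key: str) -> str:
    """Drop the trailing 'ss' from YYYYMMDDhhmmss and reverse the rest."""
    without_seconds = ts_key[:-2]
    return without_seconds[::-1]


def restore_key(row_key: str) -> str:
    """Undo reversed_key, re-appending the constant '00' seconds."""
    return row_key[::-1] + "00"


example = reversed_key("20170603162100")
```

After reversal the key starts with the last digit of the minute, which changes every minute, so consecutive minutes no longer share a common prefix. The reader can still compute the exact key for a given minute, so single-minute lookups stay a single Get.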
