
HBASE Sequential row key (YYYYMMDDHHMMSS), Deterministic Non-Random Salt

The initial part of my row key looks like "YYYYMMDDhhmmss", where 'ss' is always 00. Example: 20170603162100, which corresponds to 16:21 on 3rd June 2017. (Don't ask me why, but the timestamp has to be at the start of the key!)

This is per-minute data, so every minute (and therefore every key) is unique.

This suffers from region hot-spotting, because consecutive row keys sit next to each other on the same region server.

My read pattern: get data for a single minute (not for an hour, a day, a month, or a year).

Say I have 10 region servers.

Here is a solution I am thinking of, which looks like a kind of salt (but it is deterministic, not random):

I look at the mm (minute) part and assign a salt based on it:

00 minute: prefix A to row key
01 minute: prefix B to row key
..
09 minute: prefix J to row key
10 minute: prefix A to row key
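The mapping above amounts to taking the minute modulo the number of salt letters. A minimal sketch in Python, assuming 10 buckets (one per region server) and the letters A to J; the names `NUM_BUCKETS`, `SALTS` and `salted_key` are made up for illustration and are not part of HBase:

```python
import string

# Assumption: 10 salt buckets, one per region server mentioned in the question.
NUM_BUCKETS = 10
SALTS = string.ascii_uppercase[:NUM_BUCKETS]  # "ABCDEFGHIJ"


def salted_key(ts_key: str) -> str:
    """Prefix a YYYYMMDDhhmmss key with a salt letter chosen from its minute."""
    if len(ts_key) != 14 or not ts_key.isdigit():
        raise ValueError(f"expected 14 digits (YYYYMMDDhhmmss), got {ts_key!r}")
    minute = int(ts_key[10:12])  # the "mm" part of the key
    return SALTS[minute % NUM_BUCKETS] + ts_key


if __name__ == "__main__":
    for key in ("20170603160000", "20170603160100",
                "20170603160900", "20170603161000"):
        print(salted_key(key))
```

This only builds the key. Whether all 'A' keys really end up on one region server depends on how the table's regions are split and assigned, which this sketch does not address.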

This way, all 'A' keys should go to the first region server, and so forth. The possible advantage: all requests for a single minute go to the same region server, which is bearable for me, and in the very next minute all requests go to some other region server.

Also, when retrieving, I won't have to do parallel reads, because I actually know the salt.
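To make the read-side difference concrete, here is a small Python sketch that contrasts the two cases. With a random salt the reader cannot know the prefix and has to try every bucket; with the minute-based salt it can recompute the one key to fetch. The function names and the `SALTS` constant are hypothetical, and no HBase client call is shown:

```python
SALTS = "ABCDEFGHIJ"  # assumption: the 10 salt letters from the question


def keys_to_try_with_random_salt(ts_key: str) -> list[str]:
    """With a random salt the prefix is unknown, so every bucket is a candidate."""
    return [salt + ts_key for salt in SALTS]


def key_to_read_with_minute_salt(ts_key: str) -> str:
    """With a minute-derived salt the prefix can be recomputed from the key itself."""
    minute = int(ts_key[10:12])  # "mm" in YYYYMMDDhhmmss
    return SALTS[minute % len(SALTS)] + ts_key


if __name__ == "__main__":
    wanted = "20170603162100"
    print(len(keys_to_try_with_random_salt(wanted)), "candidate keys with a random salt")
    print(key_to_read_with_minute_salt(wanted), "is the single key with a minute salt")
```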

Can someone explain whether I am going wrong somewhere?

Well, the English alphabet covers just 26 minutes, so I would probably suggest a two-letter salt; it should still distribute properly. (How many nodes do you have?)
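One possible reading of the two-letter suggestion, sketched in Python: give each of the 60 minutes of the hour its own two-letter prefix. The encoding (quotient and remainder by 26) and the names `two_letter_salt` and `salted_key_two_letters` are assumptions made for illustration; the answer does not say how the salt should be derived.

```python
import string

LETTERS = string.ascii_uppercase  # 26 letters, so two letters cover up to 676 values


def two_letter_salt(minute: int) -> str:
    """Map a minute of the hour (0..59) to a distinct two-letter salt."""
    if not 0 <= minute <= 59:
        raise ValueError(f"minute out of range: {minute}")
    high, low = divmod(minute, len(LETTERS))
    return LETTERS[high] + LETTERS[low]


def salted_key_two_letters(ts_key: str) -> str:
    """Prefix a YYYYMMDDhhmmss key with the two-letter salt of its minute."""
    return two_letter_salt(int(ts_key[10:12])) + ts_key


if __name__ == "__main__":
    print(salted_key_two_letters("20170603162100"))
```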

Alternatively, you can try simply removing the seconds from your row key and reversing it.
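A sketch of that alternative in Python, assuming "reverse it" means reversing the whole 12-digit string that remains once the seconds are dropped. The fastest-changing digit (the last digit of the minute) then comes first, so consecutive minutes no longer sort next to each other. The function names are hypothetical:

```python
def reversed_minute_key(ts_key: str) -> str:
    """Drop the 'ss' part of a YYYYMMDDhhmmss key and reverse what is left."""
    if len(ts_key) != 14 or not ts_key.isdigit():
        raise ValueError(f"expected 14 digits (YYYYMMDDhhmmss), got {ts_key!r}")
    return ts_key[:12][::-1]


def original_key(reversed_key: str) -> str:
    """Undo the reversal and re-append the constant '00' seconds."""
    return reversed_key[::-1] + "00"


if __name__ == "__main__":
    for key in ("20170603162100", "20170603162200", "20170603162300"):
        print(key, "->", reversed_minute_key(key))
```

Since the seconds are always 00 in the question, dropping them loses no information and the original key can be rebuilt for display.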
