My row key starts with a timestamp of the form "YYYYMMDDhhmmss", where 'ss' is always 00. Example: 20170603162100, which corresponds to 16:21 on 3rd June 2017. (Don't ask me why, but the timestamp has to be at the start of the key!)
So this is per-minute data, and every minute is unique.
This suffers from region hot-spotting: the keys increase monotonically, so consecutive row keys all land on the same region server.
My read pattern: get the data for a single minute (not for an hour, a day, a month, or a year).
Say I have 10 region servers.
Here is a solution I am thinking of, which looks kind of like a salt (but is deterministic, not random):
I look at the mm part (the minute) and assign a salt based on it:
00 minute: prefix A to row key
01 minute: prefix B to row key
..
09 minute: prefix J to row key
10 minute: prefix A to row key
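The mapping above amounts to taking the minute modulo 10 and turning the result into a letter. A minimal sketch of that, assuming the key is a plain string; the names NUM_BUCKETS, salt_for and salted_key are made up for illustration and are not part of any HBase API:

```python
import string

NUM_BUCKETS = 10  # assumption: one salt letter per region server

def salt_for(ts_key: str) -> str:
    """Return the salt letter for a 'YYYYMMDDhhmmss' key."""
    minute = int(ts_key[10:12])  # the 'mm' part of the timestamp
    return string.ascii_uppercase[minute % NUM_BUCKETS]

def salted_key(ts_key: str) -> str:
    """Prefix the key with its deterministic salt letter."""
    return salt_for(ts_key) + ts_key
```

For the example key, minute 21 gives 21 % 10 = 1, so the salt is B and the stored key would be B20170603162100.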
This way all 'A' keys should go to the first region server, and so forth. The advantage would be that all requests for a single minute go to the same region server, which is bearable for me, and the very next minute all requests go to some other region server.
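To check how evenly this spreads, one can count which letter each minute of an hour would get. The sketch below also lists split points that would give one region per letter if the table were pre-split on them. Whether each of those regions actually ends up on a different region server depends on how the cluster balances regions, so treat that part as an assumption. NUM_BUCKETS, per_bucket and split_points are illustrative names only:

```python
import string
from collections import Counter

NUM_BUCKETS = 10  # assumption: matches the 10 region servers in the question

# Salt letter that each minute of an hour (00..59) would receive.
salts = [string.ascii_uppercase[m % NUM_BUCKETS] for m in range(60)]

# Number of minutes per hour that fall under each salt letter.
per_bucket = Counter(salts)

# Hypothetical split points: 9 boundaries ('B'..'J') give 10 key ranges,
# one per salt letter.
split_points = list(string.ascii_uppercase[1:NUM_BUCKETS])
```

Each letter receives the same number of minutes per hour, so the write load is even over time, although at any given moment only one letter (one key range) is taking writes.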
Also, when retrieving, I won't have to do parallel reads, because I already know the salt.
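The difference on the read side can be sketched like this: with a random salt the reader has to try every possible prefix, while with the deterministic salt it can rebuild the one exact key. This is only string handling under the assumptions above (10 buckets, letters A to J); the function names are hypothetical and no HBase client calls are shown:

```python
import string

NUM_BUCKETS = 10  # assumption: same bucket count as used when writing

def deterministic_key(ts_key: str) -> str:
    """Rebuild the single stored key from the timestamp alone."""
    minute = int(ts_key[10:12])  # the 'mm' part
    return string.ascii_uppercase[minute % NUM_BUCKETS] + ts_key

def random_salt_candidates(ts_key: str) -> list:
    """With a random salt, every prefix is possible and all must be tried."""
    return [string.ascii_uppercase[i] + ts_key for i in range(NUM_BUCKETS)]
```

So a read for one minute is a single lookup with the deterministic scheme, versus up to NUM_BUCKETS lookups with a random one.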
Can someone tell me if I am going wrong somewhere?
Well, you have just 26 minutes covered with the English alphabet, so I would probably suggest using a two-letter salt; it should still distribute properly. (How many nodes do you have?)
Alternatively, you can just remove the seconds from your row key and reverse it.
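A minimal sketch of that alternative, reading "reverse it" as reversing the characters of the key once the seconds are dropped (that reading is an assumption, and the function names are made up for illustration):

```python
def reversed_key(ts_key: str) -> str:
    """Drop the trailing 'ss' and reverse the remaining 'YYYYMMDDhhmm'."""
    return ts_key[:12][::-1]

def original_key(rev_key: str) -> str:
    """Undo the reversal and restore the constant '00' seconds."""
    return rev_key[::-1] + "00"
```

For 20170603162100 this gives 126130607102. The key now starts with the last digit of the minute, so consecutive minutes begin with different digits, which spreads them much like the modulo-10 salt does. Note that the timestamp is then no longer at the start of the key in its original form.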