I'm designing an algorithm to count unique users on a set of pages over a sliding 60-minute window.
It needs to find the unique IPs (or tokens) that have hit a particular page and total them up for the last 60 minutes.
I need this to be very fast at scale (mainly for writes; fast reads are a bonus). We could have tens of thousands of users per page, multiplied by thousands of pages.
My research is pointing me to using Redis with HyperLogLog.
I'm new to Redis, coming from a Memcache background. Could anyone give me any pointers?
Thanks
One way of doing this would be to keep an HLL key for each page/set of pages with a minute resolution. For example, if we're tracking 'index.html' and the current timestamp is 0, a visitor with the ID 'abc' can be tracked by:
PFADD index.html:0 abc
Once the minute has passed (i.e. at timestamp 1, for simplicity), a visitor such as 'def' will be added to the next key:
PFADD index.html:1 def
And so forth. To count the number of unique visitors from the last 60 minutes, assuming the current timestamp is 100, call the PFCOUNT command with the names of all 60 keys, e.g.:
PFCOUNT index.html:100 index.html:99 ... index.html:41
Note: if you want "old" counts to be evicted, call EXPIRE after each call to PFADD.
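The per-minute key scheme above could be wrapped up roughly as follows. This is only a sketch: `client` is assumed to be a redis-py style connection exposing `pfadd`, `expire` and `pfcount`, the helper names are made up for illustration, and the 61-minute expiry is an assumption chosen so a bucket outlives the 60-minute window slightly.

```python
import time

WINDOW_MINUTES = 60  # sliding window size from the question


def minute_bucket(epoch_seconds):
    """Map a Unix timestamp (seconds) to a minute-resolution bucket number."""
    return int(epoch_seconds) // 60


def window_keys(page, current_minute, window=WINDOW_MINUTES):
    """Names of the per-minute HLL keys covering the last `window` minutes."""
    return [f"{page}:{current_minute - i}" for i in range(window)]


def track_visit(client, page, visitor_id, now=None):
    """PFADD the visitor into the current minute's key, then EXPIRE that key."""
    minute = minute_bucket(time.time() if now is None else now)
    key = f"{page}:{minute}"
    client.pfadd(key, visitor_id)
    # Assumed TTL: one minute longer than the window, so the bucket is still
    # readable for the whole 60 minutes before Redis evicts it.
    client.expire(key, (WINDOW_MINUTES + 1) * 60)
    return key


def count_unique(client, page, now=None):
    """PFCOUNT over all 60 minute keys; Redis returns one merged estimate."""
    minute = minute_bucket(time.time() if now is None else now)
    return client.pfcount(*window_keys(page, minute))
```

With a current minute of 100, `window_keys("index.html", 100)` yields the 60 names from `index.html:100` down to `index.html:41`, matching the PFCOUNT call shown above.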
You can't get time intervals in a single HyperLogLog key.
A sorted set could be an option: add each user with ZADD, using the timestamp as the score, then use ZCOUNT to get the total number of unique users in that time interval. I used small numbers for timestamps in this example.
127.0.0.1:6379> ZADD activeusers:page:1 1 a1
(integer) 1
127.0.0.1:6379> ZADD activeusers:page:1 1 a2
(integer) 1
127.0.0.1:6379> ZADD activeusers:page:1 3 a5
(integer) 1
127.0.0.1:6379> ZADD activeusers:page:1 116 a7
(integer) 1
127.0.0.1:6379> ZCOUNT activeusers:page:1 60 inf
(integer) 1
127.0.0.1:6379> ZRANGEBYSCORE activeusers:page:1 60 inf
1) "a7"
When you are using ZCOUNT, you define MIN as (current time - (60*60)) and MAX as inf, so it counts the entries between (now - 3600 seconds) and (now).
One drawback of this approach is that you need to remove old data from these sets manually using ZREMRANGEBYSCORE:
127.0.0.1:6379> ZREMRANGEBYSCORE activeusers:page:1 -inf 59
(integer) 3
127.0.0.1:6379> ZRANGEBYSCORE activeusers:page:1 -inf inf
1) "a7"
127.0.0.1:6379> ZRANGEBYSCORE activeusers:page:1 -inf inf WITHSCORES
1) "a7"
2) "116"
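The sorted-set approach could look something like this from application code. It is a sketch under assumptions, not a definitive implementation: `client` is taken to be a redis-py style connection exposing `zadd`, `zcount` and `zremrangebyscore`, the helper names are hypothetical, and real Unix timestamps in seconds are used as scores instead of the small numbers in the transcript.

```python
import time

WINDOW_SECONDS = 60 * 60  # the 60-minute window from the question


def page_key(page_id):
    """Key naming taken from the transcript above: activeusers:page:<id>."""
    return f"activeusers:page:{page_id}"


def record_visit(client, page_id, user_id, now=None):
    """ZADD the user with the visit time as score."""
    now = time.time() if now is None else now
    # Re-adding an existing member only updates its score, so each user
    # appears once per page with the time of their latest visit.
    client.zadd(page_key(page_id), {user_id: now})


def count_recent(client, page_id, now=None):
    """ZCOUNT members whose score lies between (now - 3600) and +inf."""
    now = time.time() if now is None else now
    return client.zcount(page_key(page_id), now - WINDOW_SECONDS, "+inf")


def prune_old(client, page_id, now=None):
    """ZREMRANGEBYSCORE everything strictly older than the window start."""
    now = time.time() if now is None else now
    # A leading "(" makes the bound exclusive, so a score exactly equal to
    # the cutoff is kept and stays consistent with count_recent.
    cutoff = f"({now - WINDOW_SECONDS}"
    return client.zremrangebyscore(page_key(page_id), "-inf", cutoff)
```

How often `prune_old` runs is a design choice: calling it on every write keeps the sets small at the cost of an extra command, while a periodic job keeps writes cheaper but lets stale members linger between runs.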