简体   繁体   中英

Django culmulative sum of HyperLogLog (HLL) Postgres field

I'm using the HyperLogLog (hll) field to represent unique users, using the Django django-pg-hll package. What I'd like to do is get a cumulative total of unique users over a specific time period, but I'm having trouble doing this.

Given a model like:

class DailyUsers(model.Model):
  date = models.DateField()
  users = HllField()

I can get the cummulative HllField for each day like so:

queryset = models.DailyUsers.objects.annotate(
  cumulative_hll_users=Window(
    UnionAgg("users"), order_by=F('date').asc()
  )
)

However, when I try and get the cardinality (the actual number) like so:

queryset = queryset.annotate(
  cumsum=Cardinality("cumulative_hll_users")
)

The following error occurs:

django.db.utils.ProgrammingError: OVER specified, but hll_cardinality is not a window function nor an aggregate function
LINE 1: SELECT "app_dailyusers"."date", hll_cardinality...

Which is odd because Cardinality is defined as an aggregate function. I'm not sure if there's a way around this, I imagine it might be possible to do this in raw sql, but I haven't made much progress.

A solution in either the Django ORM or raw SQL would be greatly appreciated.

This bug occurs because the django-pg-hll pacakage uses the hll_cardinality function instead of the # operator for window functions. Moving to a raw sql solution fixed the issue.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM