
Django - distinct rows/objects distinguished by date/day from datetime field

I've searched for quite a while and know about several answers on SO, but none of the solutions works on my end, even though my problem is pretty simple:

What I need (using Postgres + Django 1.10): I have many rows with many duplicate dates (= days) within a datetime field. I want a queryset containing one row/object per date/day.

fk | col1 | colX | created (type: datetime)
----------------------------------------------
1  | info | info | 2016-09-03 08:25:52.142617+00:00 <- get it (time does not matter)
1  | info | info | 2016-09-03 16:26:52.142617+00:00
2  | info | info | 2016-09-03 11:25:52.142617+00:00
1  | info | info | 2016-09-14 16:26:52.142617+00:00 <- get it (time does not matter)
3  | info | info | 2016-09-14 11:25:52.142617+00:00
1  | info | info | 2016-09-25 23:25:52.142617+00:00 <- get it (time does not matter)
1  | info | info | 2016-09-25 16:26:52.142617+00:00
1  | info | info | 2016-09-25 11:25:52.142617+00:00
2  | info | info | 2016-09-25 14:27:52.142617+00:00
2  | info | info | 2016-09-25 16:26:52.142617+00:00
3  | info | info | 2016-09-25 11:25:52.142617+00:00
etc.

What's the best way to do this, in terms of both performance and Pythonic/Django style? My model/table is going to have many rows (more than a million).

EDIT 1

The results must be filtered by an fk (e.g. WHERE fk = 1) first.

I already tried the most obvious things such as

MyModel.objects.filter(fk=1).order_by('created__date').distinct('created__date')

but got the following error:

django.core.exceptions.FieldError: Cannot resolve keyword 'date' into field. Join on 'created' not permitted.

I get the same error with all(), and also when the ordering is set through class Meta instead of the order_by() query method.

Does somebody maybe know more about this error in this specific case?

I just ran into a similar issue, not with order_by() or distinct(), but with filter(). I am using Django 1.9, but that might not make any difference here.

In one model in one of my apps, filter(datetime_field__date__lt=(date(2016, 12, 5))) works fine. In another model, in a different app within the same project, I get the same error as you.

In my case, it looks as if django-money (https://github.com/django-money/django-money) causes the problem. As far as I can tell so far, the money_manager() function from djmoney.models.managers breaks the __date lookup (https://docs.djangoproject.com/en/1.9/ref/models/querysets/#date).

When I attach another manager not named objects, for example testmanager = models.Manager(), to the relevant model without wrapping it in money_manager(), the __date lookup works fine again, without any other changes to the model or the database.

I have not yet found a fully satisfying solution, but maybe you also use django-money or another third-party application that interferes with the default manager? Perhaps the traceback gives some hints about which package might be the problem.

My traceback looks like this:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/[...]/python3.4/site-packages/django/db/models/manager.py", line 122, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/[...]/python3.4/site-packages/djmoney/models/managers.py", line 164, in wrapper
    args, kwargs = _expand_money_kwargs(model, args, kwargs, exclusions)
  File "/[...]/python3.4/site-packages/djmoney/models/managers.py", line 136, in _expand_money_kwargs
    elif isinstance(_get_field(model, name), MoneyField):
  File "/[...]/python3.4/site-packages/djmoney/models/managers.py", line 63, in _get_field
    field = qs.setup_joins(parts, opts, alias)[0]
  File "/[...]/python3.4/site-packages/django/db/models/sql/query.py", line 1405, in setup_joins
    names, opts, allow_many, fail_on_missing=True)
  File "/[...]/python3.4/site-packages/django/db/models/sql/query.py", line 1373, in names_to_path
    " not permitted." % (names[pos + 1], name))
django.core.exceptions.FieldError: Cannot resolve keyword 'date' into field. Join on 'my_datetime_field' not permitted.

It doesn't seem to be possible given the current Django implementation, as this would involve using advanced DB backend functions (like Postgres window functions).

The closest you can get is to use aggregation:

from django.db.models import Min
from django.db.models.functions import TruncDay

MyModel.objects.annotate(
    created_date=TruncDay('created')
).values('created_date').annotate(id=Min('id'))

This groups the rows by day and picks the smallest id in each group:

[{'created_date': datetime.date(2017, 3, 16), 'id': 146},
 {'created_date': datetime.date(2017, 3, 28), 'id': 188},
 {'created_date': datetime.date(2017, 3, 24), 'id': 178},
 {'created_date': datetime.date(2017, 3, 23), 'id': 171},
 {'created_date': datetime.date(2017, 3, 22), 'id': 157}] ...

If you need the whole objects, you can chain this with .values_list() and use it to filter a second queryset, which results in a subquery:

MyModel.objects.filter(
    id__in=MyModel.objects.annotate(
        created_date=TruncDay('created')
    ).values('created_date').annotate(id=Min('id')).values_list(
        'id', flat=True
    )
)

FYI, this results in the following query:

SELECT
    "myapp_mymodel"."id",
    "myapp_mymodel"."created",
    "myapp_mymodel"."col1",
    "myapp_mymodel"."colX"
FROM "myapp_mymodel"
WHERE "myapp_mymodel"."id" IN (
    SELECT MIN(U0."id") AS "id"
    FROM "myapp_mymodel" U0
    GROUP BY DATE(U0."created")
)
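
To see what this subquery pattern returns without a Django project, here is a minimal sketch that runs roughly the same SQL against an in-memory SQLite database. The table name mymodel, the shortened timestamps, and the helper one_row_per_day are assumptions made for illustration, and the WHERE fk = ? clause is added to match the filtering requirement from the question; it is not part of the query shown above.

```python
import sqlite3

# Hypothetical sample rows (fk, created), loosely based on the question's table.
rows = [
    (1, "2016-09-03 08:25:52"),
    (1, "2016-09-03 16:26:52"),
    (2, "2016-09-03 11:25:52"),
    (1, "2016-09-14 16:26:52"),
    (3, "2016-09-14 11:25:52"),
    (1, "2016-09-25 23:25:52"),
    (1, "2016-09-25 16:26:52"),
    (2, "2016-09-25 14:27:52"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, fk INTEGER, created TEXT)"
)
# ids are assigned 1..8 in insertion order.
conn.executemany("INSERT INTO mymodel (fk, created) VALUES (?, ?)", rows)


def one_row_per_day(conn, fk):
    """Return one full row per calendar day for the given fk (lowest id wins)."""
    sql = """
        SELECT id, fk, created
        FROM mymodel
        WHERE id IN (
            SELECT MIN(id)
            FROM mymodel
            WHERE fk = ?
            GROUP BY DATE(created)
        )
        ORDER BY created
    """
    return conn.execute(sql, (fk,)).fetchall()


result = one_row_per_day(conn, 1)
for row in result:
    print(row)
```

Which row survives for each day is decided by MIN(id), not by the time of day, which is fine here because the question states that the time does not matter.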

Because you are using PostgreSQL, you can use a QuerySet with distinct() on the created value to get the results from your table.

Maybe a query like this would do the job:

MyModel.objects.all().distinct('created__date')

See the Django QuerySet documentation: https://docs.djangoproject.com/fr/1.10/ref/models/querysets/#distinct
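
For readers unfamiliar with what a DISTINCT ON query is expected to return, the following plain-Python sketch emulates the semantics: filter first, sort, then keep the first row of each day. It does not use Django or PostgreSQL; the list of dicts and the function distinct_on_date are hypothetical stand-ins for the table and the query.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical stand-in for the table: a subset of the question's sample rows.
rows = [
    {"fk": 1, "created": datetime(2016, 9, 3, 8, 25, 52)},
    {"fk": 1, "created": datetime(2016, 9, 3, 16, 26, 52)},
    {"fk": 2, "created": datetime(2016, 9, 3, 11, 25, 52)},
    {"fk": 1, "created": datetime(2016, 9, 14, 16, 26, 52)},
    {"fk": 3, "created": datetime(2016, 9, 14, 11, 25, 52)},
    {"fk": 1, "created": datetime(2016, 9, 25, 23, 25, 52)},
    {"fk": 1, "created": datetime(2016, 9, 25, 16, 26, 52)},
    {"fk": 1, "created": datetime(2016, 9, 25, 11, 25, 52)},
    {"fk": 2, "created": datetime(2016, 9, 25, 14, 27, 52)},
]


def distinct_on_date(rows, fk):
    """Keep the first row per calendar day among the rows with the given fk."""
    # Filter, then sort by timestamp so rows of the same day are adjacent.
    selected = sorted(
        (r for r in rows if r["fk"] == fk), key=lambda r: r["created"]
    )
    # groupby yields one group per run of equal dates; take its first row.
    return [
        next(group)
        for _, group in groupby(selected, key=lambda r: r["created"].date())
    ]


picked = distinct_on_date(rows, fk=1)
for r in picked:
    print(r["fk"], r["created"].isoformat())
```

As in SQL, the sort order decides which row represents each day; here it is the earliest timestamp of that day.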
