
Count event types per day for different timezones using Django ORM

We have a table that holds multiple events and when they were added. Events are stored in UTC by default. E.g.:

from django.db import models

class Events(models.Model):
    type = models.CharField(max_length=45, null=False)
    date_added = models.DateTimeField(auto_now_add=True)  # stored in UTC

Now we want a per-day count of each event type between two dates, start_date and end_date. E.g., for start_date = "2020-03-1" and end_date = "2020-03-31", the output should be:

[{
    "date" : "2020-03-1",
    "event1" : 200,
    "event2" : 606,
    "event3" : 595
},
{
    "date" : "2020-03-2",
    "event1" : 357,
    "event2" : 71,
    "event3" : 634
},
{
    "date" : "2020-03-3",
    "event1" : 106,
    "event2" : 943,
    "event3" : 315
},
{
    "date" : "2020-03-4",
    "event1" : 187,
    "event2" : 912,
    "event3" : 743
},
.
.
.
.
{
    "date" : "2020-03-31",
    "event1" : 879,
    "event2" : 292,
    "event3" : 438
}]

Since the users are in different timezones (America, Europe, Asia, etc.), we want to convert each timestamp to the user's timezone before counting the events. Counting in UTC gives wrong per-day counts from the user's point of view. E.g., an event created on 3 March at 1:30 am IST is stored as 2 March, 8 pm UTC and is counted under 2 March.
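To make the day shift concrete, here is a minimal sketch using only the standard library. The names IST, local_event and stored_utc are illustrative and not part of the original code; IST is modelled as a fixed +05:30 offset, which is exact because India does not observe DST.

```python
from datetime import datetime, timedelta, timezone

# Fixed +05:30 offset standing in for IST (illustrative name).
IST = timezone(timedelta(hours=5, minutes=30), "IST")

# The event from the example: 3 March 2020, 1:30 am local time in India.
local_event = datetime(2020, 3, 3, 1, 30, tzinfo=IST)

# The same instant as it would be stored in UTC.
stored_utc = local_event.astimezone(timezone.utc)

utc_day = stored_utc.date()                    # day bucket when grouping in UTC
local_day = stored_utc.astimezone(IST).date()  # day bucket the user expects

print(stored_utc.isoformat(), utc_day, local_day)
```

Grouping on utc_day puts the event under 2 March, while the user expects it under 3 March.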

Doing this with a for loop in Python would get really expensive, so we want to do it at the DB level using Django ORM. If it is not possible to depend completely on Django ORM, we want to make it as efficient as possible.

The best query that we could come up with was:

Events.objects.filter(
    pk=user_pk,
    date_added__range=(
        (end_date - time_delta).strftime("%Y-%m-%d"),
        end_date.strftime("%Y-%m-%d"),
    ),
).extra({
    "date_added": "date(date_added)"  # truncate to a (UTC) calendar day
}).values(
    "date_added",
    "type"
).annotate(
    models.Count("type")
)

This gives results like:

<QuerySet [{'date_added': datetime.date(2020, 3, 6), 'type': 'event1', 'type__count': 30}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event2', 'type__count': 189}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event3', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event4', 'type__count': 3}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event2', 'type__count': 57}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event1', 'type__count': 23}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event4', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 10), 'type': 'event1', 'type__count': 5}, 
{'date_added': datetime.date(2020, 3, 10), 'type': 'event2', 'type__count': 21}, 
{'date_added': datetime.date(2020, 3, 11), 'type': 'event2', 'type__count': 9}, 
{'date_added': datetime.date(2020, 3, 11), 'type': 'event1', 'type__count': 15}, 
{'date_added': datetime.date(2020, 3, 12), 'type': 'event2', 'type__count': 49}, 
{'date_added': datetime.date(2020, 3, 13), 'type': 'event2', 'type__count': 8}, 
{'date_added': datetime.date(2020, 3, 13), 'type': 'event1', 'type__count': 3}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event1', 'type__count': 16}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event2', 'type__count': 26}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event4', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event3', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 18), 'type': 'event2', 'type__count': 64}, 
{'date_added': datetime.date(2020, 3, 18), 'type': 'event1', 'type__count': 11}, 
'...(remaining elements truncated)...']>

This still requires a for loop to merge all the events with the same date into one dict, and the timezone issue still persists.

How to solve this?

We were able to solve this at last. We are still using a for loop to get the data into the required format, but we were able to shift the heavy lifting to the DB. A few things first:

  1. I am using MySQL DB. (See the points below for why this is relevant.)

  2. If you convert the timezone using the time difference directly ("+05:30" for IST or "+00:00" for UTC), you won't need to run any command.

  3. However, to use timezones by name in MySQL DB ("Asia/Kolkata" or "UTC"), you first need to load the timezone tables by running one command:
    • mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -D mysql -u root -p
  4. This command is for MySQL DB running on Ubuntu. There are similar
    commands for other DBs like PostgreSQL and platforms like Windows.
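If you go with the offset form from point 2, the offset string can be derived rather than hard-coded. The helper below is a hypothetical sketch (mysql_offset is not a Django or MySQL function); it formats the UTC offset of a tzinfo at a given aware instant as '+HH:MM' or '-HH:MM'. Keep in mind that a fixed offset is only right for zones without DST: for a zone like America/Chicago the offset changes during the year, which is the main reason to prefer named timezones.

```python
from datetime import datetime, timedelta, timezone, tzinfo

def mysql_offset(tz: tzinfo, at: datetime) -> str:
    """Return the UTC offset of `tz` at the aware instant `at` as '+HH:MM' or '-HH:MM'."""
    offset = at.astimezone(tz).utcoffset()
    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"

# Fixed-offset zones used purely for illustration.
ist = timezone(timedelta(hours=5, minutes=30))
minus_six = timezone(timedelta(hours=-6))
now = datetime(2020, 3, 3, 12, 0, tzinfo=timezone.utc)

print(mysql_offset(ist, now), mysql_offset(timezone.utc, now), mysql_offset(minus_six, now))
```

Any tzinfo works here, so on Python 3.9+ a zoneinfo.ZoneInfo object could be passed instead of a fixed offset.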

Here is the Django ORM query:

events_list = Events.objects.all().extra(
        {
            "date_added" : "date(CONVERT_TZ(date_added, 'UTC', 'America/Chicago'))"
        }
    ).values(
        "date_added",
        "type"
    ).annotate(
        models.Count(
            "type"
        )
    )

This will give data in the format:

<QuerySet [{'date_added': datetime.date(2020, 3, 6), 'type': 'event1', 'type__count': 31}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event2', 'type__count': 189}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event3', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 6), 'type': 'event4', 'type__count': 3}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event2', 'type__count': 58}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event1', 'type__count': 21}, 
{'date_added': datetime.date(2020, 3, 9), 'type': 'event4', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 10), 'type': 'event1', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 10), 'type': 'event2', 'type__count': 23}, 
{'date_added': datetime.date(2020, 3, 11), 'type': 'event2', 'type__count': 10}, 
{'date_added': datetime.date(2020, 3, 11), 'type': 'event1', 'type__count': 16}, 
{'date_added': datetime.date(2020, 3, 12), 'type': 'event2', 'type__count': 50}, 
{'date_added': datetime.date(2020, 3, 13), 'type': 'event2', 'type__count': 10}, 
{'date_added': datetime.date(2020, 3, 13), 'type': 'event1', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event1', 'type__count': 19}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event2', 'type__count': 27}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event4', 'type__count': 3}, 
{'date_added': datetime.date(2020, 3, 17), 'type': 'event3', 'type__count': 1}, 
{'date_added': datetime.date(2020, 3, 18), 'type': 'event2', 'type__count': 61}, 
{'date_added': datetime.date(2020, 3, 18), 'type': 'event1', 'type__count': 13}, 
'...(remaining elements truncated)...']>

Here each event type is counted after the timestamps are converted to the target timezone. Once we have these counts, all that remains is to get the data into the required format, which can easily be done with a for loop.
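A minimal sketch of that final loop, assuming rows shaped like the queryset output above (keys date_added, type and type__count). The function name pivot_counts and the sample rows are illustrative only.

```python
from collections import defaultdict
from datetime import date

def pivot_counts(rows):
    """Merge {'date_added', 'type', 'type__count'} rows into one dict per day."""
    per_day = defaultdict(dict)
    for row in rows:
        per_day[row["date_added"]][row["type"]] = row["type__count"]
    # One output dict per day, ordered by date.
    return [
        {"date": day.isoformat(), **counts}
        for day, counts in sorted(per_day.items())
    ]

# Sample rows copied from the first lines of the output above.
rows = [
    {"date_added": date(2020, 3, 6), "type": "event1", "type__count": 31},
    {"date_added": date(2020, 3, 6), "type": "event2", "type__count": 189},
    {"date_added": date(2020, 3, 9), "type": "event2", "type__count": 58},
]
result = pivot_counts(rows)
print(result)
```

The loop runs once over the already aggregated rows (at most days × event types), so it stays cheap. Note that isoformat() zero-pads the day ("2020-03-06"), and that days or event types with no events are simply absent from the result rather than reported as 0.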

PS:

  1. If you have any conditions for the query, you can use filter() instead of all(). E.g.:

     from datetime import timedelta
     from django.db import models
     from django.utils import timezone

     Events.objects.filter(
         type__in=["event1", "event2"],
         date_added__gt=timezone.now() - timedelta(days=30),
     ).extra(
         {"date_added": "date(CONVERT_TZ(date_added, 'UTC', 'Asia/Kolkata'))"}
     ).values(
         "date_added",
         "type"
     ).annotate(
         models.Count("type")
     )

    This will give data for the last 30 days for event types 1 and 2.

  2. Instead of names, you can use "+00:00" for UTC and "+05:30" for Asia/Kolkata. E.g.: "date_added": "date(CONVERT_TZ(date_added, '+00:00', '+05:30'))"
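When the filter is a start_date/end_date range as in the question, the bounds arguably need the same timezone treatment as the grouping, otherwise the first and last day are cut at UTC midnight. The sketch below is an assumption on my part rather than something from the original answer: local_range_to_utc is a hypothetical helper that turns an inclusive range of local calendar days into a half-open UTC interval.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

def local_range_to_utc(start_date: date, end_date: date, tz: tzinfo):
    """Return (start_utc, end_utc), the half-open UTC interval that covers
    start_date..end_date (inclusive) as calendar days in `tz`."""
    start_local = datetime.combine(start_date, time.min, tzinfo=tz)
    end_local = datetime.combine(end_date + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)

# Fixed +05:30 offset used for illustration; any tzinfo can be passed.
IST = timezone(timedelta(hours=5, minutes=30))
start_utc, end_utc = local_range_to_utc(date(2020, 3, 1), date(2020, 3, 31), IST)
print(start_utc.isoformat(), end_utc.isoformat())
```

With the model above, these values could then be used as filter(date_added__gte=start_utc, date_added__lt=end_utc) before the extra()/values()/annotate() chain.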
