
Why does MYSQL DB return a corrupted value when averaging over a Django models.DateTimeField?

I'm running a Django application on top of a MySQL (actually MariaDB) database.

My Django model looks like this:

import logging

from django.db import models
from django.db.models import Avg, Max, Min, Count

logger = logging.getLogger(__name__)

class myModel(models.Model):
    my_string = models.CharField(max_length=32)
    my_date = models.DateTimeField()

    @staticmethod
    def get_stats():
        # Group by my_string and aggregate my_date per group.
        logger.info(myModel.objects.values('my_string').annotate(
                count=Count('my_string'),
                min=Min('my_date'),
                max=Max('my_date'),
                avg=Avg('my_date'),
            )
        )

When I run get_stats(), I get the following log line:

[2015-06-21 09:45:40] INFO [all_logs:96] [{'my_string': u'A', 'count': 2, 'avg': 20080507582679.5, 'min': datetime.datetime(2007, 8, 2, 11, 33, 53, tzinfo=<UTC>), 'max': datetime.datetime(2009, 2, 13, 5, 20, 6, tzinfo=<UTC>)}]

The problem I have with this is that the average of the my_date field returned by the database is 20080507582679.5. Look carefully at that number: it is not a valid date.

Why doesn't the database return a valid value for the average of these two dates? How do I get the actual average of this field if the way described fails? Is Django DateTimeField not set up to handle averaging?

Q1: Why doesn't the database return a valid value for the average of these two dates?

A: The value returned is expected; it is well-defined MySQL behavior.

MySQL automatically converts a date or time value to a number if the value is used in a numeric context and vice versa.

MySQL Reference Manual: https://dev.mysql.com/doc/refman/5.5/en/date-and-time-types.html


In MySQL, the AVG aggregate function operates on numeric values.

In MySQL, a DATE or DATETIME expression can be evaluated in a numeric context.

As a simple demonstration, performing a numeric addition operation on a DATETIME implicitly converts the datetime value into a number. This query:

  SELECT NOW(), NOW()+0

returns a result like:

  NOW()                                NOW()+0  
  -------------------  -----------------------
  2015-06-23 17:57:48    20150623175748.000000

Note that the value returned for the expression NOW()+0 is not a DATETIME; it is a number.

When you specify a SUM() or AVG() function on a DATETIME expression, that is equivalent to converting the DATETIME into a number, and then summing or averaging the number.

That is, the return from the expression AVG(mydatetimecol) is equivalent to the return from the expression AVG(mydatetimecol+0).

What is being "averaged" is a numeric value. As you have observed, the value returned is not a valid datetime; and even in cases where it happens to look like a valid datetime, it is likely not a value you would consider a true "average".
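The arithmetic can be reproduced outside the database. The following is a minimal Python sketch, not MySQL itself: the helper mysql_numeric is a hypothetical name introduced here, and it only approximates the numeric-context conversion by writing the datetime as the digits YYYYMMDDhhmmss. Applied to the two dates from the question's log line, it yields the same "corrupted" number.

```python
from datetime import datetime


def mysql_numeric(dt):
    """Approximate a DATETIME read in numeric context: the digits YYYYMMDDhhmmss.

    Hypothetical helper for illustration only; not part of MySQL or Django.
    """
    return int(dt.strftime("%Y%m%d%H%M%S"))


# The min and max values reported in the question's log line.
dates = [
    datetime(2007, 8, 2, 11, 33, 53),
    datetime(2009, 2, 13, 5, 20, 6),
]

as_numbers = [mysql_numeric(d) for d in dates]

# A plain numeric mean of the digit strings, which is what AVG() ends up computing.
naive_avg = sum(as_numbers) / len(as_numbers)

print(as_numbers)  # [20070802113353, 20090213052006]
print(naive_avg)   # 20080507582679.5
```

Read as a datetime, the result would have an hour field of 58 and a seconds field of 79, which is why it cannot be parsed as a date.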


Q2: How do I get the actual average of this field if the way described fails?

A2: One way to do that is to convert the datetime into a numeric value that can be "accurately" averaged, and then convert that back into a datetime.

For example, you could convert the datetime into a numeric value representing a number of seconds from some fixed point in time, e.g.

  TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date)

You could then "average" those values, to get an average number of seconds from a fixed point in time. (NOTE: when aggregating an extremely large number of rows with extremely large values, beware of exceeding the maximum numeric value and running into numeric overflow.)

  AVG(TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date))

To convert that back to a datetime, add that value as a number of seconds back to the fixed point in time:

  '2015-01-01' + INTERVAL AVG(TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date)) SECOND

(Note that the DATETIME values are evaluated in the timezone of the MySQL session, so there are edge cases where the setting of the time_zone variable in the MySQL session will have some influence on the value returned.)
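The same three steps (offset from a fixed point, average, add back) can be sketched in plain Python to check the result for the two dates in the question. This is only an illustration of the arithmetic: average_datetime and REFERENCE are names made up here, and the datetimes are naive, as a DATETIME column would hold them.

```python
from datetime import datetime, timedelta

# Same fixed point in time as in the SQL expressions above.
REFERENCE = datetime(2015, 1, 1)


def average_datetime(values, reference=REFERENCE):
    """Average datetimes via their offsets in seconds from a fixed reference."""
    # Step 1: seconds from the fixed point (negative for earlier dates).
    offsets = [(v - reference).total_seconds() for v in values]
    # Step 2: ordinary numeric mean of those offsets.
    mean_offset = sum(offsets) / len(offsets)
    # Step 3: add the mean offset back to the fixed point.
    return reference + timedelta(seconds=mean_offset)


dates = [
    datetime(2007, 8, 2, 11, 33, 53),
    datetime(2009, 2, 13, 5, 20, 6),
]

average = average_datetime(dates)
print(average)  # 2008-05-08 20:26:59.500000
```

The choice of reference point does not change the result; it only keeps the intermediate numbers meaningful as durations rather than digit strings.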

MySQL also provides a UNIX_TIMESTAMP() function, which returns a Unix-style integer value: the number of seconds since the beginning of the Unix epoch (midnight Jan. 1, 1970 UTC). You can use that to accomplish the same operation more concisely:

  FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(t.my_date)))

Note that this final expression is really doing the same thing: converting the datetime value into a number of seconds since '1970-01-01 00:00:00' UTC, taking a numeric average of that, adding that average number of seconds back to '1970-01-01' UTC, and finally converting that back to a DATETIME value, represented in the current session time_zone.
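As a rough Python analogue of the epoch-based expression, the sketch below averages Unix timestamps and then renders the result in two time zones. The UTC+02:00 offset stands in for a hypothetical session time_zone setting; it is an assumption for illustration and not something taken from the question.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# The two values from the question's log line, which are already in UTC.
dates = [
    datetime(2007, 8, 2, 11, 33, 53, tzinfo=UTC),
    datetime(2009, 2, 13, 5, 20, 6, tzinfo=UTC),
]

# Seconds since 1970-01-01 00:00:00 UTC, averaged numerically.
mean_epoch = sum(d.timestamp() for d in dates) / len(dates)

# Convert the mean back to a datetime, first expressed in UTC.
average_utc = datetime.fromtimestamp(mean_epoch, tz=UTC)

# The same instant shown in a hypothetical session time zone of UTC+02:00.
session_tz = timezone(timedelta(hours=2))
average_session = average_utc.astimezone(session_tz)

print(average_utc)      # 2008-05-08 20:26:59.500000+00:00
print(average_session)  # 2008-05-08 22:26:59.500000+02:00
```

Both printed values denote the same instant; only the wall-clock representation depends on the time zone, which is the effect the session time_zone has on the value MySQL returns.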


Q3: Is Django DateTimeField not set up to handle averaging?

A: Apparently, the authors of Django are satisfied with the value returned from the database for a SQL expression AVG(datetime).

Plan A: Use a TIMESTAMP field instead of a DATETIME field.

Plan B: Convert DATETIME to TIMESTAMP during the computation:

FROM_UNIXTIME(ROUND(AVG(UNIX_TIMESTAMP(`my_date`))))

(Sorry, I don't know the Django syntax needed.)

When you use values(), Django will not convert the value it got from the database-Python connector. It's up to the connector to determine how the value is returned.

In this case, it seems that the MySQL connector returns a string representation with the separators removed. You can try to use datetime.strptime() with a matching format to parse it into a datetime object.
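A small sketch of that suggestion, assuming the value has the shape YYYYMMDDhhmmss (the format string and the helper name parse_compact are assumptions made here, not connector behavior that has been verified). It parses a value such as the NOW()+0 example, but it cannot recover an average: the number from the question does not correspond to any valid datetime.

```python
from datetime import datetime

# Assumed layout of the separator-free value: YYYYMMDDhhmmss.
COMPACT_FORMAT = "%Y%m%d%H%M%S"


def parse_compact(value):
    """Parse a YYYYMMDDhhmmss number; return None if it is not a valid datetime.

    Hypothetical helper for illustration only.
    """
    text = str(int(value))  # drop any fractional part before parsing
    try:
        return datetime.strptime(text, COMPACT_FORMAT)
    except ValueError:
        return None


# A single datetime in numeric form parses fine.
print(parse_compact(20150623175748))    # 2015-06-23 17:57:48

# The averaged value from the question has hour 58 and second 79, so parsing fails.
print(parse_compact(20080507582679.5))  # None
```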

