
Setting group_by in specialized query

I need to smooth data by averaging, grouping on a non-standard group_by variable that is created on the fly. My model consists of two tables:

class WthrStn(models.Model):
  name=models.CharField(max_length=64, error_messages=MOD_ERR_MSGS)
  owner_email=models.EmailField('Contact email')
  location_city=models.CharField(max_length=32, blank=True)
  location_state=models.CharField(max_length=32, blank=True)
  ...

class WthrData(models.Model):
  stn=models.ForeignKey(WthrStn)
  date=models.DateField()
  time=models.TimeField()
  temptr_out=models.DecimalField(max_digits=5, decimal_places=2)
  temptr_in=models.DecimalField(max_digits=5, decimal_places=2)

  class Meta:
    ordering = ['-date','-time']
    unique_together = (("date", "time", "stn"),)

The data in the WthrData table are loaded from an XML file at variable time increments, currently 15 or 30 minutes, although that could change over time. The table holds more than 20,000 records. I want to provide an option to display the data smoothed to a variable time unit, e.g. 30 minutes or 1, 2, or N hours (60, 120, 180, etc. minutes).

I am using SQLite3 as the DB engine. I tested the following SQL, which proved quite adequate for smoothing the data into bins of N minutes:

select id, date, time, 24*60*julianday(datetime(date || time))/N jsec, avg(temptr_out)
as temptr_out, avg(temptr_in) as temptr_in, avg(barom_mmhg) as barom_mmhg,
avg(wind_mph) as wind_mph, avg(wind_dir) as wind_dir, avg(humid_pct) as humid_pct,
avg(rain_in) as rain_in, avg(rain_rate) as rain_rate,
datetime(avg(julianday(datetime(date || time)))) as avg_date from wthr_wthrdata where
stn_id=19 group by round(jsec,0) order by stn_id,date,time;

Note that I create an output variable 'jsec' using the SQLite3 function 'julianday', which returns the number of days in the integer part and the fraction of a day in the decimal part. Multiplying by 24*60 therefore gives the number of minutes, and dividing by the N-minute resolution gives a convenient 'group by' variable that compensates for the varying time increments of the raw data.
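The binning arithmetic can be tried in isolation with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: the table wthr_demo, its sample rows, and the helper smooth() are made up for illustration, the bin key is cast to an integer and named 'bin', and an explicit space is added between date and time when they are concatenated.

```python
import sqlite3


def smooth(conn, n_minutes):
    """Average temptr_out over bins of n_minutes (hypothetical helper)."""
    sql = """
        SELECT CAST(round(24*60*julianday(date || ' ' || time) / ?, 0) AS INTEGER) AS bin,
               count(*) AS n_rows,
               avg(temptr_out) AS temptr_out
        FROM wthr_demo
        GROUP BY bin
        ORDER BY bin
    """
    return conn.execute(sql, (n_minutes,)).fetchall()


# In-memory demo table; the real wthr_wthrdata table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wthr_demo (date TEXT, time TEXT, temptr_out REAL)")
conn.executemany(
    "INSERT INTO wthr_demo VALUES (?, ?, ?)",
    [
        # Three readings around 01:00 and three around 02:00, unevenly spaced.
        ("2014-01-01", "00:50:00", 10.0),
        ("2014-01-01", "01:00:00", 12.0),
        ("2014-01-01", "01:10:00", 14.0),
        ("2014-01-01", "01:50:00", 20.0),
        ("2014-01-01", "02:00:00", 22.0),
        ("2014-01-01", "02:10:00", 24.0),
    ],
)

rows = smooth(conn, 60)
for bin_key, n_rows, avg_temp in rows:
    print(bin_key, n_rows, avg_temp)
```

Because the key is rounded rather than truncated, each 60-minute bin is centred on the hour, so readings from 00:30 up to 01:30 land in the same group.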

How can I implement this in Django? I have tried objects.raw(), but that hands a RawQuerySet, not a QuerySet, to the view, so I get error messages from the HTML template:

  <p>
    Number of data entries: {{ valid_form|length }}
  </p>

I have also tried using a standard QuerySet, with code like this:

wthrdta=WthrData.objects.all()
# extra() returns a new QuerySet, so the result has to be assigned back
wthrdta=wthrdta.extra(select={'jsec':'24*60*julianday(datetime(date || time))/{}'.format(n)})
wthrdta=wthrdta.extra(select = {'temptr_out':'avg(temptr_out)',
  'temptr_in':'avg(temptr_in)',
  'barom_mmhg':'avg(barom_mmhg)',
  'wind_mph':'avg(wind_mph)',
  'wind_dir':'avg(wind_dir)',
  'humid_pct':'avg(humid_pct)',
  'rain_in':'avg(rain_in)',
  'rain_sum_in':'sum(rain_in)',
  'rain_rate':'avg(rain_rate)',
  'avg_date':'datetime(avg(julianday(datetime(date || time))))'})

Note that here I use the SQL avg functions instead of Django's aggregate() or annotate(). This seems to generate correct SQL, but I can't seem to get the group_by set properly to the jsec value that is created at the top.

Any suggestions for how to approach this? All I really need is for the QuerySet.raw() method to return a QuerySet, or something that can be converted to a QuerySet, instead of a RawQuerySet. I cannot find an easy way to do that.

The answer to this turns out to be really simple, using a hint I found at https://gist.github.com/carymrobbins/8477219, though I modified his code slightly. To get a QuerySet from a raw query, all I did was add the following to my models.py file, right above the WthrData class definition:

from django.db import connection, models

class MyManager(models.Manager):
  def raw_as_qs(self, raw_query, params=()):
    """Execute a raw query and return a QuerySet. The first column in the
    result set must be the id field for the model.
    :type raw_query: str | unicode
    :type params: tuple[T] | dict[str | unicode, T]
    :rtype: django.db.models.query.QuerySet
    """
    cursor = connection.cursor()
    try:
      cursor.execute(raw_query, params)
      # Read the ids into a list before the cursor is closed
      ids = [row[0] for row in cursor]
    finally:
      cursor.close()
    # Only the ids are used; the other columns of the raw query are discarded
    return self.filter(id__in=ids)

Then, in my class definition for WthrData:

class WthrData(models.Model):
  objects=MyManager()
  ......

and later in the WthrData class:

  @staticmethod
  def get_smoothWthrData(stn_id, n):
    # n is the bin width in minutes; n and stn_id are passed as query parameters
    sqlcode='select id, date, time, 24*60*julianday(datetime(date || time))/%s jsec, avg(temptr_out) as temptr_out, avg(temptr_in) as temptr_in, avg(barom_mmhg) as barom_mmhg, avg(wind_mph) as wind_mph, avg(wind_dir) as wind_dir, avg(humid_pct) as humid_pct, avg(rain_in) as rain_in, avg(rain_rate) as rain_rate, datetime(avg(julianday(datetime(date || time)))) as avg_date from wthr_wthrdata where stn_id=%s group by round(jsec,0) order by stn_id,date,time;'
    return WthrData.objects.raw_as_qs(sqlcode, [n, stn_id])

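This allows me to grab results from the heavily populated WthrData table, binned over time increments, and the results come back as a QuerySet instead of a RawQuerySet. Note that raw_as_qs uses only the first column (id) of the raw query, so the returned objects hold the stored field values of one row per bin; the avg(...) columns computed in the SQL are not carried into the QuerySet.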
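If the averaged values themselves are needed in the template, one possible alternative is to read the cursor rows into a list of dicts, which still works with the |length filter. The sketch below is an illustration, not the method used above: it uses Python's built-in sqlite3 module in place of Django's connection, the helpers dictfetchall() and smoothed_rows() and the sample rows are hypothetical, only two of the measurement columns are included, and sqlite3 takes '?' placeholders where Django's cursor takes '%s'.

```python
import sqlite3


def dictfetchall(cursor):
    """Return all rows of a DB-API cursor as dicts keyed by column name."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def smoothed_rows(conn, stn_id, n_minutes):
    """Averages per n_minutes bin for one station (hypothetical helper)."""
    sql = """
        SELECT min(id) AS id,
               avg(temptr_out) AS temptr_out,
               avg(temptr_in) AS temptr_in,
               datetime(avg(julianday(date || ' ' || time))) AS avg_date
        FROM wthr_wthrdata
        WHERE stn_id = ?
        GROUP BY round(24*60*julianday(date || ' ' || time) / ?, 0)
        ORDER BY avg_date
    """
    cursor = conn.cursor()
    try:
        cursor.execute(sql, (stn_id, n_minutes))
        return dictfetchall(cursor)
    finally:
        cursor.close()


# In-memory stand-in for the real table, with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wthr_wthrdata (id INTEGER PRIMARY KEY, stn_id INTEGER, "
    "date TEXT, time TEXT, temptr_out REAL, temptr_in REAL)"
)
conn.executemany(
    "INSERT INTO wthr_wthrdata (stn_id, date, time, temptr_out, temptr_in) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (19, "2014-01-01", "00:50:00", 10.0, 18.0),
        (19, "2014-01-01", "01:00:00", 12.0, 19.0),
        (19, "2014-01-01", "01:10:00", 14.0, 20.0),
        (19, "2014-01-01", "01:50:00", 20.0, 20.0),
        (19, "2014-01-01", "02:00:00", 22.0, 21.0),
        (19, "2014-01-01", "02:10:00", 24.0, 22.0),
        (20, "2014-01-01", "01:00:00", 99.0, 99.0),  # other station, filtered out
    ],
)

result = smoothed_rows(conn, 19, 60)
for row in result:
    print(row)
```

Each dict carries the computed averages rather than the stored values of a single row, at the cost of losing QuerySet methods such as filter().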
