How do I use a subquery in SQLAlchemy to produce a moving average?

Question

I want to retrieve a list of measurements together with a moving average of those measurements. I can do that with this SQL statement (PostgreSQL interval syntax):

SELECT time, value,                
   (
       SELECT AVG(t2.value)
       FROM measurements t2
       WHERE t2.time BETWEEN t1.time - interval '5 days' AND t1.time
   ) moving_average
FROM measurements t1
ORDER BY t1.time;
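
To see what this statement computes, here is a minimal sketch that runs the same correlated subquery against an in-memory SQLite database. The sample rows are made up for illustration, and because SQLite has no interval type, `datetime(t1.time, '-5 days')` stands in for the PostgreSQL expression `t1.time - interval '5 days'`.

```python
import sqlite3

# Hypothetical sample data (ISO timestamp, value); not from the original post.
rows = [
    ("2010-09-01 00:00:00", 10.0),
    ("2010-09-03 00:00:00", 20.0),
    ("2010-09-07 00:00:00", 30.0),
    ("2010-09-15 00:00:00", 40.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (time TEXT, value REAL)")
conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# The subquery sits in the SELECT list, so it may refer to t1.time
# and is evaluated once per row of the outer query.
sql = """
SELECT t1.time, t1.value,
    (
        SELECT AVG(t2.value)
        FROM measurements t2
        WHERE t2.time BETWEEN datetime(t1.time, '-5 days') AND t1.time
    ) AS moving_average
FROM measurements t1
ORDER BY t1.time
"""
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

Each output row carries the average of all values recorded in the five days up to and including that row's timestamp.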

I want SQLAlchemy code that produces an equivalent statement. I currently have this Python code:

import datetime
from sqlalchemy import select, func

moving_average_days = 5  # configurable value, defaulting to 5
ndays = 90  # configurable value, defaulting to 90
t1 = Measurements.alias('t1')
t2 = Measurements.alias('t2')
# The inner select() is passed as a plain selectable, so it ends up in FROM
query = select([t1.c.time, t1.c.value, select([func.avg(t2.c.value)], t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time))],
            t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
        order_by(t1.c.time)

That, however, generates this SQL:

SELECT t1.time, t1.value, avg_1
FROM measurements AS t1,
    (
        SELECT avg(t2.value) AS avg_1
        FROM measurements AS t2
        WHERE t2.time BETWEEN t1.time - %(time_1)s AND t1.time
    )
WHERE t1.time > %(time_2)s
ORDER BY t1.time;

That SQL places the subquery in the FROM clause, where it cannot refer to the column values of the enclosing query's rows, so PostgreSQL rejects it with this error:

ERROR:  subquery in FROM cannot refer to other relations of same query level
LINE 6:         WHERE t2.time BETWEEN t1.time - interval '5 days' AN...

What I would like to know is: how do I get SQLAlchemy to move the subquery into the SELECT clause?

Alternatively, another way to get a moving average (without performing a query for each (time, value) pair) would be an option.
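
One possible alternative of this kind, sketched here and not taken from the original thread, is to fetch the (time, value) rows once, ordered by time, and compute the trailing average on the client. The function name `trailing_average` and the sample rows are hypothetical. The sketch assumes timestamps are distinct; with duplicate timestamps the SQL `BETWEEN` version would also count later rows that share the same time.

```python
import datetime
from collections import deque


def trailing_average(rows, window):
    """Yield (time, value, average) with the average taken over [time - window, time].

    rows must be (datetime, float) pairs sorted by time in ascending order.
    """
    buf = deque()  # measurements currently inside the window
    total = 0.0    # running sum of the values in buf
    for time, value in rows:
        buf.append((time, value))
        total += value
        # Drop measurements that are older than the start of the window
        while buf[0][0] < time - window:
            _, old_value = buf.popleft()
            total -= old_value
        yield time, value, total / len(buf)


# Hypothetical sample data, as it might come back from a single ordered query
rows = [
    (datetime.datetime(2010, 9, 1), 10.0),
    (datetime.datetime(2010, 9, 3), 20.0),
    (datetime.datetime(2010, 9, 7), 30.0),
    (datetime.datetime(2010, 9, 15), 40.0),
]
averages = list(trailing_average(rows, datetime.timedelta(days=5)))
for time, value, average in averages:
    print(time, value, average)
```

This makes a single pass over the rows, at the cost of transferring every measurement in the range to the client.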

Answer 1

Right, apparently what I needed was a so-called scalar select. With that I get this Python code, which works as I want it to (it generates SQL equivalent to the first statement in my question, which was my goal):

import datetime
from sqlalchemy import select, func

moving_average_days = 5  # configurable value, defaulting to 5
ndays = 90  # configurable value, defaulting to 90
t1 = Measurements.alias('t1')
t2 = Measurements.alias('t2')
# .label() turns the inner select into a scalar select in the column list
query = select([t1.c.time, t1.c.value,
                    select([func.avg(t2.c.value)],
                        t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time)).label('moving_average')],
            t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
        order_by(t1.c.time)

This produces the following SQL:

SELECT t1.time, t1.value,
    (
        SELECT avg(t2.value) AS avg_1
        FROM measurements AS t2 
        WHERE t2.time BETWEEN t1.time - :time_1 AND t1.time
    ) AS moving_average 
FROM measurements AS t1
WHERE t1.time > :time_2 ORDER BY t1.time;
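
The same query can probably be written for newer SQLAlchemy releases as in the sketch below. It assumes SQLAlchemy 1.4 or later, where `select()` takes its columns positionally and `scalar_subquery()` marks a select as a scalar expression. The `Table` definition is hypothetical, since the original post does not show how `Measurements` is declared.

```python
import datetime

from sqlalchemy import Column, DateTime, Float, MetaData, Table, func, select

# Hypothetical table definition; column names follow the SQL in the question.
metadata = MetaData()
Measurements = Table(
    "measurements",
    metadata,
    Column("time", DateTime),
    Column("value", Float),
)

moving_average_days = 5  # configurable value, defaulting to 5
ndays = 90  # configurable value, defaulting to 90
t1 = Measurements.alias("t1")
t2 = Measurements.alias("t2")

window = datetime.timedelta(days=moving_average_days)
# scalar_subquery() lets the select be used as a column expression;
# it correlates to t1 automatically once embedded in the outer query.
moving_average = (
    select(func.avg(t2.c.value))
    .where(t2.c.time.between(t1.c.time - window, t1.c.time))
    .scalar_subquery()
    .label("moving_average")
)

# Naive UTC timestamp, matching the utcnow() call in the code above
now_utc = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
cutoff = now_utc - datetime.timedelta(days=ndays)

query = (
    select(t1.c.time, t1.c.value, moving_average)
    .where(t1.c.time > cutoff)
    .order_by(t1.c.time)
)
sql = str(query)  # rendered with the default dialect, for inspection only
print(sql)
```

Printing the statement is a quick way to confirm that the subquery is rendered in the SELECT list and that `t1` appears only in the outer FROM clause.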

Answer 1: score 5, accepted, 2010-09-22 00:56:06
