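MySQL: calculate moving average of N rows
I'm trying to calculate a moving average over N rows, for every row, in a single query. In the example case, I'm trying to calculate a 50-row moving average.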
SELECT
    h1.date,
    h1.security_id,
    ( SELECT
          AVG(last50.close)
      FROM (
          SELECT h.close
          FROM history as h
          WHERE h.date <= h1.date AND h.security_id = h1.security_id
          ORDER BY h.date DESC
          LIMIT 50
      ) as last50
    ) as avg50
FROM history as h1
However, MySQL gives me an error when I run this query:
Unknown column 'h1.date' in 'where clause'
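I'm trying this approach because the other solutions I found don't seem to fit my use case. There are solutions for an N-day moving average, but since my data set does not contain every date, I need the average over N rows.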
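This solution (shown below) does not work, because AVG (and likewise SUM and COUNT) does not take LIMIT into account; the aggregate runs over every matching row before LIMIT is applied: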
SELECT
    t1.data_date,
    ( SELECT SUM(t2.price) / COUNT(t2.price)
      FROM t as t2
      WHERE t2.data_date <= t1.data_date
      ORDER BY t2.data_date DESC
      LIMIT 5
    ) AS 'five_row_moving_average_price'
FROM t AS t1
ORDER BY t1.data_date;
This question looks promising, but it's a bit hard for me to follow.
Any suggestions? Here is a SQLFiddle to play with.
Plan
- self-join history on the last 50 days
- take the average, grouping by date and security id (of the current row)
Query
select curr.date, curr.security_id, avg(prev.close)
from history curr
inner join history prev
    on prev.`date` between date_sub(curr.`date`, interval 49 day) and curr.`date`
    and curr.security_id = prev.security_id
group by 1, 2
order by 2, 1
;
Output
+---------------------------+-------------+--------------------+
| date | security_id | avg(prev.close) |
+---------------------------+-------------+--------------------+
| January, 04 2016 00:00:00 | 1 | 10.770000457763672 |
| January, 05 2016 00:00:00 | 1 | 10.800000190734863 |
| January, 06 2016 00:00:00 | 1 | 10.673333485921225 |
| January, 07 2016 00:00:00 | 1 | 10.59250020980835 |
| January, 08 2016 00:00:00 | 1 | 10.432000160217285 |
| January, 11 2016 00:00:00 | 1 | 10.40166680018107 |
| January, 12 2016 00:00:00 | 1 | 10.344285828726631 |
| January, 13 2016 00:00:00 | 1 | 10.297500133514404 |
| January, 14 2016 00:00:00 | 1 | 10.2877779006958 |
| January, 04 2016 00:00:00 | 2 | 56.15999984741211 |
| January, 05 2016 00:00:00 | 2 | 56.18499946594238 |
| .. | .. | .. |
+---------------------------+-------------+--------------------+
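Note that the join above windows on calendar days, not on rows. When the data skips dates (the output above has no rows for January 9 and 10), a 50-day window holds fewer than 50 rows. The following is a minimal Python sketch of that effect; the weekday-only calendar and the helper name `rows_in_day_window` are assumptions made for illustration and are not taken from the actual history table.

```python
from datetime import date, timedelta


def rows_in_day_window(dates, current, days=50):
    """Count rows dated within [current - (days - 1), current].

    This mirrors the BETWEEN date_sub(curr.date, interval 49 day) AND curr.date
    join condition: the window is measured in calendar days, not in rows.
    """
    start = current - timedelta(days=days - 1)
    return sum(1 for d in dates if start <= d <= current)


# Hypothetical calendar: one row per weekday, starting Monday 2016-01-04.
all_days = [date(2016, 1, 4) + timedelta(days=i) for i in range(120)]
trading_days = [d for d in all_days if d.weekday() < 5]

current = trading_days[-1]
n_rows = rows_in_day_window(trading_days, current)
print(current, n_rows)  # the 50-day window contains only 36 rows here
```

With weekends missing, the day-based average covers roughly 36 rows rather than 50, which is why a row-based window is needed for this data set.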
Reference
Modified to use the last 50 rows
select
    rnk_curr.`date`, rnk_curr.security_id, avg(rnk_prev50.close)
from
(
    select `date`, security_id,
        @row_num := if(@lag = security_id, @row_num + 1,
                       if(@lag := security_id, 1, 1)) as row_num
    from history
    cross join ( select @row_num := 1, @lag := null ) params
    order by security_id, `date`
) rnk_curr
inner join
(
    select `date`, security_id, close,
        @row_num := if(@lag = security_id, @row_num + 1,
                       if(@lag := security_id, 1, 1)) as row_num
    from history
    cross join ( select @row_num := 1, @lag := null ) params
    order by security_id, `date`
) rnk_prev50
    on rnk_curr.security_id = rnk_prev50.security_id
    and rnk_prev50.row_num between rnk_curr.row_num - 49 and rnk_curr.row_num
group by 1,2
order by 2,1
;
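Notes
The if function enforces the correct evaluation order of the variables.
In MySQL 8, window functions with a frame clause can be used to get the average.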
SELECT date, security_id,
    AVG(close) OVER (PARTITION BY security_id ORDER BY date ROWS 49 PRECEDING) as ma
FROM history
ORDER BY date DESC
This computes the average of the current row and the 49 preceding rows within the same security_id.
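To make the frame semantics concrete, here is a small Python sketch of what a `ROWS n-1 PRECEDING` frame partitioned by security_id computes. It is an illustration under assumptions, not MySQL's implementation: the function name `trailing_row_average` and the sample rows are made up. For the first rows of each security, where fewer than n rows exist, the average is taken over the rows available so far.

```python
from collections import defaultdict, deque


def trailing_row_average(rows, n=50):
    """Average `close` over the current row and up to n - 1 preceding rows.

    rows: iterable of (date, security_id, close) tuples.
    Rows are ordered by (security_id, date) and each security keeps its own
    window, mirroring PARTITION BY security_id ORDER BY date.
    """
    windows = defaultdict(lambda: deque(maxlen=n))  # one bounded window per security
    result = []
    for day, security_id, close in sorted(rows, key=lambda r: (r[1], r[0])):
        window = windows[security_id]
        window.append(close)  # the oldest value drops out once n rows are held
        result.append((day, security_id, sum(window) / len(window)))
    return result


# Hypothetical sample rows, using a 2-row window to keep the output short.
sample = [
    ("2016-01-04", 1, 10.0),
    ("2016-01-05", 1, 20.0),
    ("2016-01-06", 1, 30.0),
    ("2016-01-04", 2, 5.0),
]
for row in trailing_row_average(sample, n=2):
    print(row)
```

If partial windows at the start of each security are not wanted, the same loop can emit None until `len(window) == n`.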