简体   繁体   中英

How to calculate average for every month from start from year in Presto's SQL (Athena)?

Below is an example of the table data I have

| date       | value |
| 2020-01-01 |  20   |
| 2020-01-14 |  10   |
| 2020-02-02 |  30   |
| 2020-02-11 |  25   |
| 2020-02-25 |  25   |
| 2020-03-13 |  34   |
| 2020-03-21 |  10   |
| 2020-04-06 |  55   |
| 2020-04-07 |  11   |

I would like to generate a result set as below

| date       | value |  average                       |
| 2020-01-01 |  20   |  Jan average                   |
| 2020-01-14 |  10   |  Jan average                   |   
| 2020-02-02 |  30   |  Jan & Feb average             |
| 2020-02-11 |  25   |  Jan & Feb average             |
| 2020-02-25 |  25   |  Jan & Feb average             |
| 2020-03-13 |  34   |  Jan & Feb & Mar average       |
| 2020-03-21 |  10   |  Jan & Feb & Mar average       |
| 2020-04-06 |  55   |  Jan & Feb & Mar & Apr average |
| 2020-04-07 |  11   |  Jan & Feb & Mar & Apr average |

I tried to use window function OVER() and PARTITION() but I managed to get average on month by month rather than starting from the year.

Any suggestions, please.

Thanks

Not sure I understand your question, but if all you want is a running average for each row bound by year:

SELECT date, value, (
  SELECT AVG(value)
  FROM data ds
  WHERE ds.date <= d.date AND YEAR(ds.date) = YEAR(d.date)
) average
FROM data d
ORDER BY d.date ASC;

Example with MySQL (the syntax for this specific query is the same)

If you want to include later rows of the same month in the average, use WHERE MONTH(ds.date) <= MONTH(d.date) .

This following query should give your expected output-

Demo Here

SELECT A.*,
(
    SELECT AVG(Value * 1.00) 
    FROM your_table B 
    WHERE YEAR(B.Date) = YEAR(A.DAte) 
    AND MONTH(B.Date) <= MONTH(A.DAte)
)
FROM your_table A

This query will make your output per year. But if you wants no partition by YEAR, just remove the YEAR filter from the sub query.

This following query will return AVG with no consideration of YEAR, just AVG of all before months-

Demo Here

SELECT A.*,
(
    SELECT AVG(Value * 1.00) 
    FROM your_table B 
    WHERE B.date <= 
    (
        SELECT MAX(Date) 
        FROM your_table C
        WHERE YEAR(c.Date) = YEAR(A.Date)
        AND MONTH(C.Date) = MONTH(A.Date)
    )
)
FROM your_table A

I think you want:

select 
    t.*,
    avg(value) over(
        partition by year(date)
        order by month(date)
    ) running_avg
from mytable t

This puts each year in a separate partition, and then orders partition rows by month.

SELECT a.date,
    a.value,
     (Select avg(b.value) from myTable B where b.date < a.date and YEAR(a.date) = YEAR(b.date)) 
From myTable a

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM