
sum of last 12 months of data where each month has 2 rows of data — Spark SQL

My dataset has two columns: one column for the date (which holds both the month-start and month-end dates), and a second column that contains the count:

1-Nov-18    58
30-Nov-18   76
1-Dec-18    93
31-Dec-18   57
1-Jan-19    62
31-Jan-19   78
1-Feb-19    87
28-Feb-19   75
1-Mar-19    54
31-Mar-19   59
1-Apr-19    76
30-Apr-19   79
1-May-19    81
31-May-19   80
1-Jun-19    71
30-Jun-19   52
1-Jul-19    75
31-Jul-19   57
1-Aug-19    76
31-Aug-19   76
1-Sep-19    63
30-Sep-19   57
1-Oct-19    81
30-Oct-19   84
and so on..

I need my output as:

Oct2019 1707

Basically, for the month Oct 2019 I need the sum of counts for all records from Nov '18 to Oct '19 (the last 12 months); similarly, for the month Nov 2019 I would need the sum of all records from Dec '18 to Nov '19, and so on for the other months.
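(For reference, the 1707 for Oct 2019 is the sum of all 24 rows shown above, i.e. both rows of each of the 12 months from Nov '18 through Oct '19: 58 + 76 + 93 + 57 + 62 + 78 + 87 + 75 + 54 + 59 + 76 + 79 + 81 + 80 + 71 + 52 + 75 + 57 + 76 + 76 + 63 + 57 + 81 + 84 = 1707.)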

I think what you are looking for is a GROUP BY clause with an aggregate (here SUM) in the SELECT list. I can't give you the perfect answer because I need more information about your schema: what are your database name, table name, and column names?

SELECT `dateField`,SUM(`countField`)
    FROM `TableName`
    GROUP BY `dateField`;

You can also group by month like this:

GROUP BY YEAR(`dateField`), MONTH(`dateField`)
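Since the question is tagged Spark SQL, here is a minimal sketch of that monthly rollup, assuming a hypothetical table t with a string date column dt (values like '1-Nov-18') and a numeric column cnt; the exact datetime patterns accepted by to_date differ between Spark 2.x and 3.x, so treat the format string as an assumption:

-- Parse the d-MMM-yy strings and collapse the two rows of each month into one.
SELECT year(to_date(dt, 'd-MMM-yy'))  AS yr,
       month(to_date(dt, 'd-MMM-yy')) AS mon,
       SUM(cnt)                       AS month_cnt
FROM t
GROUP BY year(to_date(dt, 'd-MMM-yy')), month(to_date(dt, 'd-MMM-yy'))
ORDER BY yr, mon;

This gives one total per calendar month; it does not yet produce the rolling 12-month sum the question asks for.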

I think you want aggregation and a window function with a RANGE frame:

-- Roll the two rows of each month up to one row, then sum the current month
-- plus the preceding 11 months with a RANGE frame over the month-start date.
select date_trunc('MM', date) as mon, SUM(cnt) as month_cnt,
       SUM(SUM(cnt)) OVER (ORDER BY MIN(date)
                           RANGE BETWEEN INTERVAL 11 MONTH PRECEDING AND CURRENT ROW
                          ) as running_12_months
from t
group by date_trunc('MM', date);
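If your Spark version does not accept an interval offset in a RANGE frame (support varies across releases), an equivalent sketch, assuming the same table t with columns date and cnt and that no month is missing from the data, is to aggregate to one row per month first and then use a ROWS frame:

-- Inner query: one row per month. Outer query: 11 preceding rows plus the
-- current row cover exactly 12 consecutive months.
SELECT mon,
       month_cnt,
       SUM(month_cnt) OVER (ORDER BY mon
                            ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS running_12_months
FROM (
    SELECT date_trunc('MM', date) AS mon, SUM(cnt) AS month_cnt
    FROM t
    GROUP BY date_trunc('MM', date)
) m;

Note that the first 11 months will show partial sums, since fewer than 12 months of history exist for them.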
