
Cumulative sum by year_month, location, state in Hive SQL

I want to compute a cumulative count on the recorrencia column, based on the other columns, using Hive SQL.

+------------+---------+-------+--------------+--+
| t.ano_mes  | t.site  | t.uf  | recorrencia  |
+------------+---------+-------+--------------+--+
| 202001     | 174     | AM    | 1            |
| 202002     | 174     | AM    | 1            |
| 202003     | 174     | AM    | 1            |
| 202004     | 174     | AM    | 1            |
| 202005     | 174     | AM    | 1            |
| 202006     | 174     | AM    | 1            |
| 202007     | 174     | AM    | 1            |
| 202008     | 174     | AM    | 1            |
| 202005     | 1JN     | SP    | 1            |
| 202006     | 1JN     | SP    | 1            |
| 202005     | 1LJ     | SP    | 1            |
| 202009     | 1LJ     | SP    | 1            |
| 202001     | 1RG     | SP    | 1            |
| 202002     | 1RG     | SP    | 1            |
| 202003     | 1RG     | SP    | 1            |
| 202004     | 1RG     | SP    | 1            |
| 202005     | 1RG     | SP    | 1            |
| 202006     | 1RG     | SP    | 1            |
| 202007     | 1RG     | SP    | 1            |

Expected output:

+------------+---------+-------+--------------+----------+
| t.ano_mes  | t.site  | t.uf  | recorrencia  | cum_rec  |
+------------+---------+-------+--------------+----------+
| 202001     | 174     | AM    | 1            | 1        |
| 202002     | 174     | AM    | 1            | 2        |
| 202003     | 174     | AM    | 1            | 3        |
| 202004     | 174     | AM    | 1            | 4        |
| 202005     | 174     | AM    | 1            | 5        |
| 202006     | 174     | AM    | 1            | 6        |
| 202007     | 174     | AM    | 1            | 7        |
| 202008     | 174     | AM    | 1            | 8        |
| 202005     | 1JN     | SP    | 1            | 1        |
| 202006     | 1JN     | SP    | 1            | 2        |
| 202005     | 1LJ     | SP    | 1            | 1        |
| 202009     | 1LJ     | SP    | 1            | 2        |
| 202001     | 1RG     | SP    | 1            | 1        |
| 202002     | 1RG     | SP    | 1            | 2        |
| 202003     | 1RG     | SP    | 1            | 3        |
| 202004     | 1RG     | SP    | 1            | 4        |
| 202005     | 1RG     | SP    | 1            | 5        |
| 202006     | 1RG     | SP    | 1            | 6        |
| 202007     | 1RG     | SP    | 1            | 7        |

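I have already tried several functions, such as COUNT(*) OVER (t.ano_mes) and COUNT(*) OVER (t.site), but the running total keeps accumulating to the end of the table and does not restart when t.site changes.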

The counter should restart as soon as t.site changes.

That would be:

sum(recorrencia) over(partition by t.site order by t.ano_mes) as cum_rec

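The partition by clause resets the sum every time the site changes, and order by t.ano_mes makes the sum accumulate in year-month order within each site.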
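
For context, the expression might sit in a full query like the sketch below. The table name recorrencias is hypothetical, since the question does not name the table. The explicit rows between frame is also an addition: when order by is given without a frame, Hive defaults to range between unbounded preceding and current row, which produces the same totals as long as each site has at most one row per ano_mes.

```sql
-- Sketch only: `recorrencias` is a hypothetical table name.
select
    t.ano_mes,
    t.site,
    t.uf,
    t.recorrencia,
    sum(t.recorrencia) over (
        partition by t.site      -- restart the total for each site
        order by t.ano_mes       -- accumulate in year-month order
        rows between unbounded preceding and current row
    ) as cum_rec
from recorrencias t
order by t.site, t.ano_mes;
```

The final order by only controls how the result is displayed; the running total itself is determined by the order by inside the over clause.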

Note that if recorrencia is always 1, as in your sample data, then row_number() is enough:

row_number() over(partition by t.site order by t.ano_mes) as cum_rec
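
The title also mentions state. In the sample data each site belongs to a single uf, so partitioning by t.site alone is sufficient there. If the same site code could appear under more than one uf (an assumption, not something the sample shows), adding t.uf to the partition would keep those counters separate. A minimal sketch, again using the hypothetical table name recorrencias:

```sql
-- Sketch only: `recorrencias` is a hypothetical table name.
select
    t.ano_mes,
    t.site,
    t.uf,
    t.recorrencia,
    row_number() over (
        partition by t.site, t.uf   -- one counter per (site, state) pair
        order by t.ano_mes
    ) as cum_rec
from recorrencias t;
```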

