Calculate from previous row in BigQuery and same column using JavaScript
I need to calculate pending_principal from the previous row. Is there any way to do this directly in SQL? I'm using BigQuery and JavaScript.
Sample data:

| Date | rownum | late_fee | interest | pending_principal |
|---|---|---|---|---|
| 2020-01-01 | 1 | 0 | 100000 | 1000000 |
| 2020-01-02 | 2 | null | 150000 | null |
| 2020-01-03 | 3 | null | 200000 | null |
| 2020-01-04 | 4 | null | 250000 | null |
| 2020-01-05 | 1 | 100000 | 300000 | 1000000 |
| 2020-01-06 | 2 | null | 900000 | null |

I want to calculate pending_principal and late_fee for the rows where they are null.

The logic for late_fee: if rownum = 1, late_fee already exists in the table; if rownum is not 1, the logic is:
late_fee = 5% * previous row's pending_principal

The logic for pending_principal: if rownum = 1, pending_principal already exists in the table; if rownum is not 1, the logic is:
pending_principal = previous pending_principal + late_fee + interest

For example, on 2020-01-02:
late_fee = 5% * 1,000,000 = 50,000
pending_principal = 1,000,000 + 50,000 + 150,000 = 1,200,000
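
To make the two rules concrete, here is a minimal plain-JavaScript sketch of a single step, run outside BigQuery. The helper name `nextRow` is hypothetical, and rounding the late fee to whole units is an assumption that the rules above do not state.

```javascript
// Hypothetical helper: computes one non-first row from the previous row's
// pending principal and the current row's interest.
// Assumption: the late fee is rounded to whole currency units.
function nextRow(prevPendingPrincipal, interest) {
  // 5% of the previous pending principal, using integer arithmetic first
  const lateFee = Math.round(prevPendingPrincipal * 5 / 100);
  const pendingPrincipal = prevPendingPrincipal + lateFee + interest;
  return { lateFee, pendingPrincipal };
}

// The 2020-01-02 row from the sample data.
const jan02 = nextRow(1000000, 150000);
console.log(jan02.lateFee, jan02.pendingPrincipal);
```

Feeding each result back in as `prevPendingPrincipal` gives the following rows, until a row with rownum = 1 restarts from the stored values.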
I wrote this query:
CREATE TEMP FUNCTION udf_calc(x ARRAY<STRUCT<rownum INT64, late_fee INT64, interest INT64, pending_principal INT64>>)
RETURNS STRUCT<rownum INT64, late_fee INT64, interest INT64, pending_principal INT64>
LANGUAGE js
AS """
  var vrownum = 0;
  var vlate_fee = 0;
  var vinterest = 0;
  var vpending_principal = 0;
  for (var row of x) {
    if (vrownum == 1) {
      vlate_fee = row.late_fee;
    } else {
      vlate_fee = parseInt(vpending_principal) * 0.05;
    }
    if (vrownum === 1) {
      vpending_principal = row.pending_principal;
    } else {
      vpending_principal = parseInt(vpending_principal) + parseInt(vlate_fee) + parseInt(row.interest);
    }
    vinterest = row.interest;
    vrownum = row.rownum;
  }
  r = {rownum: vrownum, late_fee: vlate_fee, interest: vinterest, pending_principal: vpending_principal};
  return r;
""";
WITH mytable AS (
  SELECT date '2020-01-01' as date, 1 as rownum, 0 as late_fee, 100000 as interest, 1000000 as pending_principal UNION ALL
  SELECT date '2020-01-02', 2, null, 150000, null UNION ALL
  SELECT date '2020-01-03', 3, null, 200000, null UNION ALL
  SELECT date '2020-01-04', 4, null, 250000, null UNION ALL
  SELECT date '2020-01-05', 1, 100000, 300000, 100000 UNION ALL
  SELECT date '2020-01-06', 2, null, 900000, null
)
select date,
  udf_calc(array_agg(STRUCT(rownum, late_fee, interest, pending_principal)) over (order by date rows unbounded preceding)).*
from mytable
But the result is not correct:

| Date | rownum | late_fee | interest | pending_principal |
|---|---|---|---|---|
| 2020-01-01 | 1 | 0 | 0 | 1000000 |
| 2020-01-02 | 2 | 50000 | 150000 | 1200000 |
| 2020-01-03 | 3 | 60000 | 200000 | 1460000 |
| 2020-01-04 | 4 | 73000 | 250000 | 1783000 |
| 2020-01-05 | 1 | 89150 | 300000 | 2172150 |
| 2020-01-06 | 2 | 108608 | 900000 | 3180757 |

I expect the result to be:

| Date | rownum | late_fee | interest | pending_principal |
|---|---|---|---|---|
| 2020-01-01 | 1 | 0 | 0 | 1000000 |
| 2020-01-02 | 2 | 50000 | 150000 | 1200000 |
| 2020-01-03 | 3 | 60000 | 200000 | 1460000 |
| 2020-01-04 | 4 | 73000 | 250000 | 1783000 |
| 2020-01-05 | 1 | 100000 | 300000 | 1000000 |
| 2020-01-06 | 2 | 50000 | 900000 | 1950000 |

I think my script did not evaluate the condition `rownum == 1`. Is there some way to do this?
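
One detail in the loop of the query above may explain this: `vrownum` is only assigned at the end of each iteration, so the `vrownum == 1` check sees the previous row's number, not the current one. Below is a minimal plain-JavaScript sketch, run outside BigQuery, that tests the current row instead. The names `rows` and `fillRows` are hypothetical, the values come from the sample table, and rounding the late fee to whole units is an assumption.

```javascript
// Sample rows from the question's table; null marks the values to compute.
const rows = [
  { date: "2020-01-01", rownum: 1, late_fee: 0, interest: 100000, pending_principal: 1000000 },
  { date: "2020-01-02", rownum: 2, late_fee: null, interest: 150000, pending_principal: null },
  { date: "2020-01-03", rownum: 3, late_fee: null, interest: 200000, pending_principal: null },
  { date: "2020-01-04", rownum: 4, late_fee: null, interest: 250000, pending_principal: null },
  { date: "2020-01-05", rownum: 1, late_fee: 100000, interest: 300000, pending_principal: 1000000 },
  { date: "2020-01-06", rownum: 2, late_fee: null, interest: 900000, pending_principal: null },
];

// Hypothetical helper: fills late_fee and pending_principal row by row.
function fillRows(input) {
  let prevPending = 0; // pending_principal carried over from the previous row
  return input.map((row) => {
    let lateFee;
    let pending;
    if (row.rownum === 1) {
      // Check the CURRENT row: stored values are used as-is and restart the chain.
      lateFee = row.late_fee;
      pending = row.pending_principal;
    } else {
      // Assumption: late fee rounded to whole units.
      lateFee = Math.round(prevPending * 5 / 100);
      pending = prevPending + lateFee + row.interest;
    }
    prevPending = pending;
    return { ...row, late_fee: lateFee, pending_principal: pending };
  });
}

const filled = fillRows(rows);
console.table(filled);
```

This is only a check of the logic; inside a BigQuery JavaScript UDF the struct fields may arrive as strings, which is presumably why the query uses `parseInt`.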
It is easier to split the rows into groups with something like group_num:
CREATE TEMP FUNCTION udf_calc(x ARRAY<STRUCT<late_fee INT64, interest INT64, pending_principal INT64>>)
RETURNS STRUCT<late_fee INT64, interest INT64, pending_principal INT64>
LANGUAGE js
AS """
  var vlate_fee = 0;
  var vinterest = 0;
  var vpending_principal = 0;
  for (var row of x) {
    if (vpending_principal === 0) {
      // First row of the group: late_fee and pending_principal already exist in the table.
      vlate_fee = row.late_fee;
      vpending_principal = row.pending_principal;
    } else {
      // Later rows: 5% of the previous pending_principal, then add fee and interest.
      vlate_fee = parseInt(vpending_principal) * 0.05;
      vpending_principal = parseInt(vpending_principal) + parseInt(vlate_fee) + parseInt(row.interest);
    }
    vinterest = row.interest;
  }
  var r = {late_fee: vlate_fee, interest: vinterest, pending_principal: vpending_principal};
  return r;
""";
WITH mytable AS (
  SELECT date '2020-01-01' as date, 1 as group_num, 0 as late_fee, 100000 as interest, 1000000 as pending_principal UNION ALL
  SELECT date '2020-01-02', 1, null, 150000, null UNION ALL
  SELECT date '2020-01-03', 1, null, 200000, null UNION ALL
  SELECT date '2020-01-04', 1, null, 250000, null UNION ALL
  SELECT date '2020-01-05', 2, 100000, 300000, 1000000 UNION ALL
  SELECT date '2020-01-06', 2, null, 900000, null
)
select date,
  udf_calc(array_agg(STRUCT(late_fee, interest, pending_principal)) over (partition by group_num order by date rows unbounded preceding)).*
from mytable
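
The query above hard-codes group_num in the sample data. If the real table only has rownum, one way to derive the group is a running count of the rows where rownum is 1; in BigQuery this could probably be written as `COUNTIF(rownum = 1) OVER (ORDER BY date)`. A plain-JavaScript sketch of the same idea, where `assignGroupNum` is a hypothetical helper and the rows are assumed to be sorted by date already:

```javascript
// Hypothetical helper: numbers the groups by counting how many rows with
// rownum === 1 have been seen so far (rows must already be ordered by date).
function assignGroupNum(rows) {
  let groupNum = 0;
  return rows.map((row) => {
    if (row.rownum === 1) {
      groupNum += 1; // a row with rownum 1 starts a new group
    }
    return { ...row, group_num: groupNum };
  });
}

// Dates and rownum values from the sample data.
const grouped = assignGroupNum([
  { date: "2020-01-01", rownum: 1 },
  { date: "2020-01-02", rownum: 2 },
  { date: "2020-01-03", rownum: 3 },
  { date: "2020-01-04", rownum: 4 },
  { date: "2020-01-05", rownum: 1 },
  { date: "2020-01-06", rownum: 2 },
]);
console.log(grouped.map((r) => r.group_num));
```

The first four rows end up in group 1 and the last two in group 2, matching the group_num values used in the query.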