Calculate from the previous row of the same column in BigQuery using JavaScript

I need to calculate pending_principal from the previous row. Is there any way to do this directly in SQL? I'm using BigQuery and JavaScript.

Sample data:

Date        rownum  late_fee  interest  pending_principal
2020-01-01  1       0         100000    1000000
2020-01-02  2       null      150000    null
2020-01-03  3       null      200000    null
2020-01-04  4       null      250000    null
2020-01-05  1       100000    300000    1000000
2020-01-06  2       null      900000    null

I want to calculate pending_principal and late_fee wherever they are null.

Logic for late_fee: if rownum = 1, late_fee already exists in the table; otherwise:

late_fee = 5% * previous row's pending_principal

Logic for pending_principal: if rownum = 1, pending_principal already exists in the table; otherwise:

pending_principal = previous pending_principal + late_fee + interest

For example, on 2020-01-02:

late_fee = 5% * 1,000,000 = 50,000

pending_principal = 1,000,000 + 50,000 + 150,000 = 1,200,000
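
As a rough sketch, the two rules above can be written out in plain JavaScript, outside BigQuery. The function name fillPending and the row-object shape are hypothetical, and rounding the 5% fee down to a whole number is an assumption the question does not state.

```javascript
// Plain-JavaScript sketch of the recurrence (not a BigQuery UDF).
// Each row: { rownum, late_fee, interest, pending_principal }; null marks a value to compute.
function fillPending(rows) {
  const out = [];
  let prevPending = 0; // pending_principal of the previous row
  for (const row of rows) {
    let lateFee;
    let pending;
    if (row.rownum === 1) {
      // First row of a group: both values are already stored in the table.
      lateFee = row.late_fee;
      pending = row.pending_principal;
    } else {
      // Assumption: 5% of the previous pending_principal, rounded down.
      lateFee = Math.floor(prevPending * 5 / 100);
      pending = prevPending + lateFee + row.interest;
    }
    out.push({ rownum: row.rownum, late_fee: lateFee, interest: row.interest, pending_principal: pending });
    prevPending = pending;
  }
  return out;
}

// Sample data from the question, ordered by date.
const sample = [
  { rownum: 1, late_fee: 0,      interest: 100000, pending_principal: 1000000 },
  { rownum: 2, late_fee: null,   interest: 150000, pending_principal: null },
  { rownum: 3, late_fee: null,   interest: 200000, pending_principal: null },
  { rownum: 4, late_fee: null,   interest: 250000, pending_principal: null },
  { rownum: 1, late_fee: 100000, interest: 300000, pending_principal: 1000000 },
  { rownum: 2, late_fee: null,   interest: 900000, pending_principal: null },
];

const filled = fillPending(sample);
// pending_principal per row: 1000000, 1200000, 1460000, 1783000, 1000000, 1950000
console.log(filled.map(r => r.pending_principal));
```

The key point is that the test is made on the current row's rownum, so the running balance resets whenever a new group starts.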

I wrote this query:

CREATE TEMP FUNCTION udf_calc(x ARRAY<STRUCT<rownum INT64, late_fee INT64, interest INT64, pending_principal INT64>>)
RETURNS STRUCT<rownum INT64,late_fee INT64, interest INT64, pending_principal INT64>
LANGUAGE js
AS """
  var vrownum = 0;
  var vlate_fee = 0;
  var vinterest = 0;
  var vpending_principal = 0;
  for (var row of x)
  {
    if (vrownum == 1) {
      vlate_fee=row.late_fee
    }
    else {vlate_fee = parseInt(vpending_principal) * 0.05}
    ;
    if (vrownum === 1) {
      vpending_principal = row.pending_principal;
    }
    else {
      vpending_principal = parseInt(vpending_principal) + parseInt(vlate_fee) + parseInt(row.interest);
    }
    vinterest = row.interest;
    vrownum = row.rownum;
  }
  r = {rownum:vrownum,late_fee:vlate_fee, interest:vinterest, pending_principal:vpending_principal};
  return r;
""";

WITH mytable AS (
  SELECT date '2020-01-01' as date, 1 as rownum , 0 as late_fee, 100000 as interest, 1000000 as pending_principal UNION ALL
  SELECT date '2020-01-02',2 , null, 150000, null UNION ALL
  SELECT date '2020-01-03',3 , null, 200000, null UNION ALL
  SELECT date '2020-01-04',4 ,null, 250000, null UNION ALL
  SELECT date '2020-01-05',1 , 100000 , 300000, 1000000 UNION ALL
  SELECT date '2020-01-06',2 ,  null, 900000, null
)
select date,
  udf_calc(array_agg(STRUCT(rownum, late_fee, interest, pending_principal)) over (order by date rows unbounded preceding)).*
from mytable

But the result is not correct:

Date        rownum  late_fee  interest  pending_principal
2020-01-01  1       0         0         1000000
2020-01-02  2       50000     150000    1200000
2020-01-03  3       60000     200000    1460000
2020-01-04  4       73000     250000    1783000
2020-01-05  1       89150     300000    2172150
2020-01-06  2       108608    900000    3180757

I expect the result to be:

Date        rownum  late_fee  interest  pending_principal
2020-01-01  1       0         0         1000000
2020-01-02  2       50000     150000    1200000
2020-01-03  3       60000     200000    1460000
2020-01-04  4       73000     250000    1783000
2020-01-05  1       100000    300000    1000000
2020-01-06  2       50000     900000    1950000

I think my script is not evaluating the rownum == 1 condition correctly.

Is there some way to do this?

It is easier to split the rows into groups with something like group_num and partition the window by it:

CREATE TEMP FUNCTION udf_calc(x ARRAY<STRUCT<late_fee INT64, interest INT64, pending_principal INT64>>)
RETURNS STRUCT<late_fee INT64, interest INT64, pending_principal INT64>
LANGUAGE js
AS """
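  // State carried from one row of the group to the next.
  var vlate_fee = 0;
  var vinterest = 0;
  var vpending_principal = 0;
  var first = true;
  for (var row of x)
  {
    if (first) {
      // First row of a group: late_fee and pending_principal are already in the table.
      vlate_fee = row.late_fee;
      vpending_principal = row.pending_principal;
      first = false;
    }
    else {
      // Later rows: 5% of the previous pending_principal, then roll the balance forward.
      vlate_fee = parseInt(vpending_principal) * 0.05;
      vpending_principal = parseInt(vpending_principal) + parseInt(vlate_fee) + parseInt(row.interest);
    }
    vinterest = row.interest;
  }
  var r = {late_fee:vlate_fee, interest:vinterest, pending_principal:vpending_principal};
  return r;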
""";

WITH mytable AS (
  SELECT date '2020-01-01' as date, 1 as group_num , 0 as late_fee, 100000 as interest, 1000000 as pending_principal UNION ALL
  SELECT date '2020-01-02',1 , null, 150000, null UNION ALL
  SELECT date '2020-01-03',1 , null, 200000, null UNION ALL
  SELECT date '2020-01-04',1 ,null, 250000, null UNION ALL
  SELECT date '2020-01-05',2 , 100000 , 300000, 1000000 UNION ALL
  SELECT date '2020-01-06',2 ,  null, 900000, null
)
select date,
  udf_calc(array_agg(STRUCT(late_fee, interest, pending_principal)) over (partition by group_num order by date rows unbounded preceding)).*
from mytable
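
If the table only has rownum, as in the question, a group_num column can be derived by counting how many rows with rownum = 1 have appeared so far. The sketch below shows that idea in plain JavaScript; assignGroupNum is a hypothetical helper, and it assumes the rows are already sorted by date.

```javascript
// Hypothetical helper: derive group_num as a running count of rows where rownum === 1.
// Assumes the input rows are already ordered by date.
function assignGroupNum(rows) {
  let group = 0;
  return rows.map(row => {
    if (row.rownum === 1) {
      group += 1; // a new group starts on every rownum = 1
    }
    return { ...row, group_num: group };
  });
}

// The date and rownum columns from the question's sample data.
const datedRows = [
  { date: '2020-01-01', rownum: 1 },
  { date: '2020-01-02', rownum: 2 },
  { date: '2020-01-03', rownum: 3 },
  { date: '2020-01-04', rownum: 4 },
  { date: '2020-01-05', rownum: 1 },
  { date: '2020-01-06', rownum: 2 },
];

// group_num per row: 1, 1, 1, 1, 2, 2
console.log(assignGroupNum(datedRows).map(r => r.group_num));
```

The resulting values match the group_num column used in the query above.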
