
In BigQuery, replace null with number in a column under certain circumstances

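It is hard to explain in words what we want to accomplish, but easy to explain with an example. We have an integer column that only increases within a partition, and that also contains many null values: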

with
  t1 as (
    select 1 as rowNum, null as col1 union all
    select 2 as rowNum, null as col1 union all
    select 3 as rowNum, 1 as col1 union all
    select 4 as rowNum, null as col1 union all
    select 5 as rowNum, null as col1 union all
    select 6 as rowNum, null as col1 union all
    select 7 as rowNum, null as col1 union all
    select 8 as rowNum, null as col1 union all
    select 9 as rowNum, 2 as col1 union all
    select 10 as rowNum, 2 as col1 union all
    select 11 as rowNum, null as col1 union all
    select 12 as rowNum, 2 as col1 union all
    select 13 as rowNum, null as col1 union all
    select 14 as rowNum, null as col1 union all
    select 15 as rowNum, 2 as col1 union all
    select 16 as rowNum, null as col1 union all
    select 17 as rowNum, null as col1 union all
    select 18 as rowNum, null as col1 union all
    select 19 as rowNum, null as col1 union all
    select 20 as rowNum, null as col1 union all
    select 21 as rowNum, null as col1 union all
    select 22 as rowNum, 3 as col1 union all
    select 23 as rowNum, 3 as col1 union all
    select 24 as rowNum, null as col1 union all
    select 25 as rowNum, 3 as col1 union all
    select 26 as rowNum, 3 as col1 union all
    select 27 as rowNum, null as col1 union all
    select 28 as rowNum, null as col1 union all
    select 29 as rowNum, null as col1 union all
    select 30 as rowNum, 4 as col1 union all
    select 31 as rowNum, 4 as col1 union all
    select 32 as rowNum, null as col1 union all
    select 33 as rowNum, null as col1
  )

select * from t1

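Most of the null values in col1 should be kept. However, if null values lie between two identical integers, those nulls should be replaced with that integer. In the example above, the nulls in rows 11, 13, and 14 should be replaced with 2, and the null in row 24 should be replaced with 3, because each of them lies between two identical integers. All other null values stay unchanged.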
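As a sanity check on the rule above, here is a minimal Python sketch of it. It is not part of the original question: the names `fill_between_equal`, `sample`, and `changed_rows` are made up for illustration, and the list simply restates the col1 values of rows 1 to 33.

```python
# Hypothetical reference implementation of the rule; all names are illustrative.
def fill_between_equal(values):
    """Fill a None only when the nearest non-None values on both sides are equal."""
    n = len(values)
    prev_seen = [None] * n  # nearest non-None value at or before each position
    next_seen = [None] * n  # nearest non-None value at or after each position
    last = None
    for i in range(n):
        if values[i] is not None:
            last = values[i]
        prev_seen[i] = last
    last = None
    for i in range(n - 1, -1, -1):
        if values[i] is not None:
            last = values[i]
        next_seen[i] = last
    return [p if p is not None and p == nx else None
            for p, nx in zip(prev_seen, next_seen)]

# col1 for rowNum 1..33, copied from the sample data above
sample = ([None, None, 1] + [None] * 5 + [2, 2, None, 2, None, None, 2]
          + [None] * 6 + [3, 3, None, 3, 3] + [None] * 3 + [4, 4, None, None])

result = fill_between_equal(sample)
changed_rows = [i + 1 for i, (a, b) in enumerate(zip(sample, result)) if a != b]
print(changed_rows)  # → [11, 13, 14, 24]
```

Only rows 11, 13, 14, and 24 change, which matches the description: every other null sits between two different integers or at an open end of the column.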

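This can be solved with window functions. The window part1 looks backward (from the first row up to the current row) and part2 looks forward (from the last row back to the current row). If the last non-null value is the same in both windows, take that value; otherwise return null.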

with
  t1 as (
    select 1 as rowNum, null as col1 union all
    select 2 as rowNum, null as col1 union all
    select 3 as rowNum, 1 as col1 union all
    select 4 as rowNum, null as col1 union all
    select 5 as rowNum, null as col1 union all
    select 6 as rowNum, null as col1 union all
    select 7 as rowNum, null as col1 union all
    select 8 as rowNum, null as col1 union all
    select 9 as rowNum, 2 as col1 union all
    select 10 as rowNum, 2 as col1 union all
    select 11 as rowNum, null as col1 union all
    select 12 as rowNum, 2 as col1 union all
    select 13 as rowNum, null as col1 union all
    select 14 as rowNum, null as col1 union all
    select 15 as rowNum, 2 as col1 union all
    select 16 as rowNum, null as col1 union all
    select 17 as rowNum, null as col1 union all
    select 18 as rowNum, null as col1 union all
    select 19 as rowNum, null as col1 union all
    select 20 as rowNum, null as col1 union all
    select 21 as rowNum, null as col1 union all
    select 22 as rowNum, 3 as col1 union all
    select 23 as rowNum, 3 as col1 union all
    select 24 as rowNum, null as col1 union all
    select 25 as rowNum, 3 as col1 union all
    select 26 as rowNum, 3 as col1 union all
    select 27 as rowNum, null as col1 union all
    select 28 as rowNum, null as col1 union all
    select 29 as rowNum, null as col1 union all
    select 30 as rowNum, 4 as col1 union all
    select 31 as rowNum, 4 as col1 union all
    select 32 as rowNum, null as col1 union all
    select 33 as rowNum, null as col1
  )

select *,
  -- keep a value only when the nearest non-null col1 at or before this row
  -- equals the nearest non-null col1 at or after this row
  if(last_value(col1 ignore nulls) over part1 = last_value(col1 ignore nulls) over part2,
     last_value(col1 ignore nulls) over part1,
     null) as col1_new
from t1
window
  part1 as (order by rowNum asc rows between unbounded preceding and current row),
  part2 as (order by rowNum desc rows between unbounded preceding and current row)
order by 1

Also consider the approach below:

select * except(grp), 
  if(col1 is null and max(col1) over win2 = max(col1) over win3,
    max(col1) over win2, col1
  ) new_col1
from (
  select *, count(*) over win1 - countif(col1 is null ) over win1 as grp
  from t1
  window win1 as (order by rowNum rows between unbounded preceding and 1 preceding)
)
window win2 as (partition by grp), 
win3 as (order by grp range between 1 preceding and 1 preceding)          

If applied to the sample data in your question, the output is

[image: query output]
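The grp-based query comes without an explanation, so the following Python sketch offers one reading of it. It is an illustration only: `fill_via_groups`, `sample`, and `changed_rows` are hypothetical names, and the list restates the col1 values of rows 1 to 33 from the question. The idea is that grp counts the non-null values seen before each row, so every group ends at a non-null value, and a null is filled only when its own group and the previous group end with the same value.

```python
# Hypothetical Python mirror of the grp-based query; all names are illustrative.
def fill_via_groups(values):
    # grp[i] = number of non-None values strictly before position i
    # (the SQL derives this as count(*) - countif(col1 is null) over win1).
    grp, seen = [], 0
    for v in values:
        grp.append(seen)
        if v is not None:
            seen += 1
    # Each group holds at most one non-None value (its last row),
    # so the group maximum is simply that value.
    group_max = {}
    for g, v in zip(grp, values):
        if v is not None:
            group_max[g] = v
    out = []
    for g, v in zip(grp, values):
        here = group_max.get(g)        # plays the role of max(col1) over win2
        before = group_max.get(g - 1)  # plays the role of max(col1) over win3
        fill = v is None and here is not None and here == before
        out.append(here if fill else v)
    return out

# col1 for rowNum 1..33, copied from the sample data in the question
sample = ([None, None, 1] + [None] * 5 + [2, 2, None, 2, None, None, 2]
          + [None] * 6 + [3, 3, None, 3, 3] + [None] * 3 + [4, 4, None, None])

result = fill_via_groups(sample)
changed_rows = [i + 1 for i, (a, b) in enumerate(zip(sample, result)) if a != b]
print(changed_rows)  # → [11, 13, 14, 24]
```

On this sample the sketch fills exactly the rows the question asks for and leaves every other null untouched.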

