BQ - getting a field from an array of structs without join

I have a table with the following columns:

items ARRAY<STRUCT<label STRING, counter INTEGER>>
explore BOOLEAN

For each record, I want to pick the label with the highest counter, and then count explore for each unique label. Ideally, I would run something like this:

SELECT FIRST_VALUE(items.label) OVER (ORDER BY items.counter DESC) as label,
       COUNT(explore) as explore
FROM my_table
GROUP BY 1

If this is the data in my table:

explore   items
   1      [['A',1],['B',3]]
   1      [['B',1]]
   0      [['C',2],['D',1]]

Then I would like to get:

label  explore
 'B'      2
 'C'      1

Consider the approach below:

select ( select label from t.items
    order by counter desc limit 1
  ) label, 
  count(*) explore
from your_table t
group by label           

If applied to the sample data in your question:

with your_table as (
    select 1 explore, [struct('A' as label, 1 as counter), struct('B' as label, 3 as counter) ] items union all 
    select 1, [struct('B', 1)] union all 
    select 0, [struct('C', 2), struct('D', 1) ] 
)

the output is:

[image: query output]
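
As a rough cross-check outside BigQuery, the same logic can be mirrored in plain Python: for each row, take the label of the struct with the largest counter (the correlated subquery), then count rows per label (the group by). This is only a sketch. The names `rows` and `top_label` are illustrative and not part of the answer, and ties on counter are broken by first occurrence here, whereas the SQL gives no such guarantee.

```python
from collections import Counter

# Sample rows from the question: (explore, items), with items as (label, counter) pairs.
rows = [
    (1, [("A", 1), ("B", 3)]),
    (1, [("B", 1)]),
    (0, [("C", 2), ("D", 1)]),
]


def top_label(items):
    """Label of the item with the highest counter (hypothetical helper).

    Mirrors `select label from t.items order by counter desc limit 1`.
    Returns None for an empty array, as the scalar subquery would yield NULL.
    """
    if not items:
        return None
    return max(items, key=lambda item: item[1])[0]


# Equivalent of `count(*) ... group by label`.
counts = Counter(top_label(items) for _explore, items in rows)
print(dict(counts))  # {'B': 2, 'C': 1}
```

For the sample data this matches the result asked for in the question: B is counted twice and C once.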

Using your sample data, consider the approach below.

with data as (
    select 1 as explore, [STRUCT( 'A' as label, 1 as counter), STRUCT( 'B' as label, 3 as counter) ] as items
    union all select 1 as explore, [STRUCT( 'B' as label, 1 as counter)] as items
    union all select 0 as explore, [STRUCT( 'C' as label, 2 as counter), STRUCT( 'D' as label, 1 as counter) ] as items
),

add_row_num as (
SELECT 
        explore,
        items,
        row_number() over (order by explore desc) as row_number
FROM data
),

get_highest_label as (
select 
    explore,
    row_number,
    label,
    counter,
    first_value(label) over (partition by row_number order by counter desc) as highest_label_per_row 
from add_row_num, unnest(items)
),

-- https://stackoverflow.com/questions/36675521/delete-duplicate-rows-from-a-bigquery-table (REMOVE DUPLICATE)
remove_dups as (
  SELECT
      *,
      ROW_NUMBER()
          OVER (PARTITION BY row_number) as new_row_number
  FROM get_highest_label
)

select 
    highest_label_per_row,
    count(highest_label_per_row) as explore,
from remove_dups 

where new_row_number = 1
group by highest_label_per_row


Output:

[image: query output]
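
To make the stages of this query easier to follow, here is a plain-Python sketch that walks through the same steps: number the rows, unnest the items, attach the highest label to every unnested record, keep one record per source row, and count. The variable names mirror the CTE names but are otherwise illustrative. One assumption worth stating: a row with an empty items array produces no records in the comma join with unnest, so it is skipped here as well.

```python
from collections import Counter

# Sample rows from the question: (explore, items), with items as (label, counter) pairs.
data = [
    (1, [("A", 1), ("B", 3)]),
    (1, [("B", 1)]),
    (0, [("C", 2), ("D", 1)]),
]

# add_row_num: give every source row a unique id (the SQL uses row_number()).
add_row_num = [
    (row_id, explore, items)
    for row_id, (explore, items) in enumerate(data, start=1)
]

# get_highest_label: unnest items and attach first_value(label) per row id,
# ordered by counter descending.
get_highest_label = []
for row_id, explore, items in add_row_num:
    if not items:
        continue  # an empty array yields no rows after unnest
    highest = sorted(items, key=lambda item: item[1], reverse=True)[0][0]
    for label, counter in items:
        get_highest_label.append((row_id, explore, label, counter, highest))

# remove_dups: keep a single unnested record per source row.
seen = set()
remove_dups = []
for record in get_highest_label:
    if record[0] not in seen:
        seen.add(record[0])
        remove_dups.append(record)

# Final select: count rows per highest label.
result = Counter(record[4] for record in remove_dups)
print(dict(result))  # {'B': 2, 'C': 1}
```

The unnest step turns three source rows into five records, which is why the de-duplication stage is needed before counting.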
