Get first N elements from an array in BigQuery table

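I have an array column and I want to get its first N elements (keeping the array data type). Is there a good way to do this? Ideally without unnesting, ranking, and using array_agg to get back to an array.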

I could also do this (to get the first 2 elements):

WITH data AS
(
  SELECT 1001 as id, ['a', 'b', 'c'] as array_1
  UNION ALL
  SELECT 1002 as id, ['d', 'e', 'f', 'g'] as array_1
  UNION ALL
  SELECT 1003 as id, ['h', 'i'] as array_1
)
select *,
       [array_1[SAFE_OFFSET(0)], array_1[SAFE_OFFSET(1)]] as my_result
from data

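But this is clearly not a good solution, because it fails when an array has only 1 element: SAFE_OFFSET(1) then returns NULL, and BigQuery raises an error when an array in the query result contains a NULL element.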

Here is a general solution using a UDF that you can call with any array type:

CREATE TEMP FUNCTION TopN(arr ANY TYPE, n INT64) AS (
  ARRAY(SELECT x FROM UNNEST(arr) AS x WITH OFFSET off WHERE off < n ORDER BY off)
);

WITH data AS
(
  SELECT 1001 as id, ['a', 'b', 'c'] as array_1
  UNION ALL
  SELECT 1002 as id, ['d', 'e', 'f', 'g'] as array_1
  UNION ALL
  SELECT 1003 as id, ['h', 'i'] as array_1
)
select *, TopN(array_1, 2) AS my_result
from data

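It uses UNNEST and the ARRAY function, which it sounds like you did not want to use, but it has the advantage of being general enough that you can pass any array to it.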

Another option for BigQuery Standard SQL (using a JS UDF):

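#standardSQL
CREATE TEMP FUNCTION FirstN(arr ARRAY<STRING>, N FLOAT64)
RETURNS ARRAY<STRING> LANGUAGE js AS """
  return arr.slice(0, N);
""";
WITH data AS
(
  SELECT 1001 as id, ['a', 'b', 'c'] as array_1
  UNION ALL
  SELECT 1002 as id, ['d', 'e', 'f', 'g'] as array_1
  UNION ALL
  SELECT 1003 as id, ['h', 'i'] as array_1
)
SELECT *,
  FirstN(array_1, 3) AS my_result
FROM data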
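
To see why the JS UDF copes with arrays shorter than N, it may help to run the same slice call outside BigQuery. The following is only a sketch in plain JavaScript: the function name firstN and the rows constant are illustrative stand-ins for the UDF body and the sample data, not part of any BigQuery API.

```javascript
// Plain-JavaScript stand-in for the body of the FirstN UDF.
// The name firstN and the sample rows are illustrative assumptions.
function firstN(arr, n) {
  // slice(0, n) copies the elements at offsets 0 .. n-1 and simply stops
  // early when the array is shorter, so no padding values are added.
  return arr.slice(0, n);
}

// Same sample data as the `data` CTE in the question.
const rows = [
  { id: 1001, array_1: ['a', 'b', 'c'] },
  { id: 1002, array_1: ['d', 'e', 'f', 'g'] },
  { id: 1003, array_1: ['h', 'i'] },
];

for (const row of rows) {
  console.log(row.id, firstN(row.array_1, 3));
}
```

Because slice never pads the result, the two-element array comes back unchanged, so the situation that breaks the SAFE_OFFSET approach (a NULL placed inside the result array) does not arise here.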
