Filter elements from an array in BigQuery

I have a BigQuery table with the following structure:

select ["apple", "of", "the", "tree"] as array_col, 1 as label
union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label);

I would like to write a query that returns this table with certain elements removed from the arrays. For instance, I want to drop every element of array_col that is either "of" or "the", obtaining the following table:

select ["apple", "tree"] as array_col, 1 as label
union all (select ["boy", "streets"] as array_col, 2 as label);

Is there an easy way to do this in BigQuery?

Thanks!

From the docs:

with arrays as (
    select ["apple", "of", "the", "tree"] as array_col, 1 as label
    union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label)
)
select
  array(
      select x
      from unnest(array_col) AS x
      where x not in ('of', 'the')
  ) as array_filter,
  label
from arrays;
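
If element order matters, or the list of words to drop should live in one place, a possible variant looks like the sketch below. It is not part of the original answer: the params CTE and the names stop_words and pos are assumptions made for illustration. WITH OFFSET exposes each element's position so that ORDER BY can pin the output order explicitly, and NOT IN UNNEST(...) checks each element against the word list.

```sql
-- Sketch only: `params`, `stop_words` and `pos` are hypothetical names.
with arrays as (
    select ["apple", "of", "the", "tree"] as array_col, 1 as label
    union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label)
),
params as (
    -- Words to remove, kept in a single array so they are easy to change.
    select ["of", "the"] as stop_words
)
select
  array(
      select x
      from unnest(array_col) as x with offset as pos
      where x not in unnest(stop_words)  -- drop every listed word
      order by pos                       -- keep the original element order
  ) as array_filter,
  label
from arrays
cross join params;
```

Because params has exactly one row, the cross join only attaches stop_words to each row of arrays and does not multiply rows.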

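You can also filter with a regular expression: serialize the array to a JSON string, strip the unwanted elements with REGEXP_REPLACE, and parse the result back into an array. This may help when filtering multiple array columns or very large tables. Note that the pattern used here only matches an element that is followed by a comma, so it would not remove "of" or "the" if one of them were the last element of an array.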

WITH arrays AS (
    SELECT ["apple", "of", "the", "tree"] AS array_col, 1 AS label
    UNION ALL (SELECT ["boy", "of", "the", "streets"] AS array_col, 2 AS label)
)
SELECT
  JSON_VALUE_ARRAY(
      REGEXP_REPLACE(TO_JSON_STRING(array_col), r'("of",)|("the",)', ''),
      '$'
  ) AS array_col,
  label
FROM arrays;
