Condition search for ARRAY of STRUCT in BigQuery

I have a problem in BigQuery related to ARRAY and STRUCT. I have a data structure like this:

    SELECT 
       1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
       UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
       UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
       UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
       UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data

Now I want to pick rows by a condition on the name. For example, I want only the rows whose array contains the names 'a' and 'b' and nothing else (name = 'a' AND name = 'b'). For the example above, only id 1 should be returned as a correct answer.

If I flatten the array using UNNEST and run this query, I get empty results:

WITH sequences AS
  (SELECT 
   1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
   UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
   UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data)
     
SELECT sequences.id
        
FROM sequences

WHERE EXISTS (SELECT * FROM UNNEST(data) AS x WHERE x.name = 'a' AND x.name = 'b' )
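
A likely reason for the empty result, added here as a note: the inner WHERE is evaluated against one array element at a time, and a single x.name cannot equal 'a' and 'b' at once, so EXISTS never finds a row. A minimal sketch on a two-element literal array (the column alias both_at_once is made up for illustration):

```sql
SELECT
  x.name,
  x.name = 'a' AND x.name = 'b' AS both_at_once  -- evaluated per element
FROM UNNEST([STRUCT('a' AS name), STRUCT('b' AS name)]) AS x;
-- both_at_once is false for both rows
```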

I have tried to use JOIN to get the correct results, but the JOIN gave me extra results. For example, this query returned ids 1, 2, 3, 4:

WITH sequences AS
  (SELECT 
   1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
   UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
   UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data),
     
flat AS (SELECT sequences.id, data.*
        
FROM sequences, UNNEST(data) AS data)

SELECT 
 f1.id
FROM flat f1
JOIN flat f2 USING(id)
WHERE ( f1.name = 'a' AND f2.name = 'b') 

How can I pick only the rows where the names are exactly 'a' and 'b'?

Assuming names are distinct within data:

#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE (
  SELECT 
    COUNTIF(name IN ('a', 'b')) = 2
    AND COUNT(name) = 2
  FROM t.data
)
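
To try the query above without a real table, the sample rows from the question can stand in for `project.dataset.table`. This is only a sketch: the CTE name sequences comes from the question, and UNNEST(t.data) with the alias x is used here in place of FROM t.data.

```sql
#standardSQL
WITH sequences AS (
  SELECT 1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
  UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
  UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
  UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
  UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
)
SELECT t.id
FROM sequences AS t
WHERE (
  SELECT
    COUNTIF(x.name IN ('a', 'b')) = 2  -- with distinct names: both 'a' and 'b' are present
    AND COUNT(x.name) = 2              -- and the array holds no other names
  FROM UNNEST(t.data) AS x
);
-- returns id 1 only
```

The second condition is what excludes ids 2, 3 and 4, which contain 'a' and 'b' plus other names.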

(After writing this, I found it is not very different from Mikhail Berlyant's answer.)

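-- True when every element of data has a name listed in names
-- and every listed name occurs in data (names is assumed to hold distinct values).
CREATE TEMP FUNCTION has_names(data ANY TYPE, names ARRAY<STRING>)
AS ((
    SELECT ARRAY_LENGTH(data) = COUNT(*)
       AND COUNT(DISTINCT x.name) = ARRAY_LENGTH(names)
    FROM UNNEST(data) AS x
    WHERE x.name IN UNNEST(names)
));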

WITH sequences AS
  (SELECT 
   1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
   UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
   UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
   UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data)
     
SELECT sequences.id
FROM sequences
WHERE has_names(data, ['a', 'b']);
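
For comparison, the requirement can also be written out with EXISTS and NOT EXISTS, which does not depend on names being distinct. This is an alternative sketch rather than part of either answer above; it reuses the sequences sample data from the question, and the aliases s and x are chosen here for illustration.

```sql
WITH sequences AS (
  SELECT 1 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
  UNION ALL SELECT 2 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name)] AS data
  UNION ALL SELECT 3 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('c' AS name, 'c1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
  UNION ALL SELECT 4 AS id, [STRUCT('a' AS name, 'a1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name), STRUCT('d' AS name, 'd1' AS last_name)] AS data
  UNION ALL SELECT 5 AS id, [STRUCT('d' AS name, 'd1' AS last_name), STRUCT('b' AS name, 'b1' AS last_name)] AS data
)
SELECT s.id
FROM sequences AS s
WHERE EXISTS (SELECT 1 FROM UNNEST(s.data) AS x WHERE x.name = 'a')    -- some element is 'a'
  AND EXISTS (SELECT 1 FROM UNNEST(s.data) AS x WHERE x.name = 'b')    -- some element is 'b'
  AND NOT EXISTS (                                                     -- no element is anything else
    SELECT 1 FROM UNNEST(s.data) AS x WHERE x.name NOT IN ('a', 'b')
  );
-- returns id 1 only
```

The first two conditions are what the self-join in the question tests, which is why it kept ids 1 to 4; the NOT EXISTS is the missing piece. A NULL name is not caught by NOT IN, so an explicit NULL check would be needed if NULL names can occur.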
