
BigQuery query doesn't work with UNNEST()

I'm trying to search Stack Overflow data in BigQuery with a query that matches a string pattern against answer bodies and keeps only questions that carry a given tag.

WITH question_answers_join AS (
  SELECT *
  FROM (
    SELECT id, creation_date, title
      , (SELECT AS STRUCT body b
         FROM `bigquery-public-data.stackoverflow.posts_answers` 
         WHERE a.id=parent_id
      ) answers
      , SPLIT(tags, '|') tags
    FROM `bigquery-public-data.stackoverflow.posts_questions` a
  )
)
SELECT *
FROM question_answers_join
WHERE 'google-bigquery' IN UNNEST(tags)
AND REGEXP_CONTAINS(answers.b, r"hello")
ORDER BY RAND()
LIMIT 100

However, I get this error:

Scalar subquery produced more than one element

What is it referring to? How can I fix this?

The query below is for BigQuery Standard SQL:

#standardSQL
WITH question_answers_join AS (
  SELECT *
  FROM (
    SELECT id, creation_date, title
      , ARRAY(SELECT body                             /* ARRAY(...) replaces the scalar subquery that caused the error */
         FROM `bigquery-public-data.stackoverflow.posts_answers` 
         WHERE a.id=parent_id
      ) answers
      , SPLIT(tags, '|') tags
    FROM `bigquery-public-data.stackoverflow.posts_questions` a
  )
)
SELECT *
FROM question_answers_join 
WHERE 'google-bigquery' IN UNNEST(tags)
AND EXISTS (
  SELECT 1 
  FROM UNNEST(answers) answer 
  WHERE REGEXP_CONTAINS(answer, r"hello")
)
ORDER BY RAND()
LIMIT 100    

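Comparing this with your original query shows just two differences. The first is the actual cause of the error: a scalar subquery such as (SELECT AS STRUCT ...) may return at most one row, but a question can have several answers, so it is replaced with ARRAY(...), which collects every matching answer body. The second follows from the first: answers is now an array, so the REGEXP_CONTAINS filter is applied to each element through EXISTS over UNNEST(answers).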
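To see the difference without scanning the public dataset, here is a minimal sketch on inline data. The CTE names questions and answers, the column alias answer_bodies, and the sample rows are made up for illustration and stand in for the real posts_questions and posts_answers tables.

```sql
#standardSQL
-- Hypothetical inline data standing in for the public Stack Overflow tables.
WITH questions AS (
  SELECT 1 AS id, 'google-bigquery|sql' AS tags UNION ALL
  SELECT 2, 'python'
),
answers AS (
  SELECT 1 AS parent_id, 'hello world' AS body UNION ALL
  SELECT 1, 'another answer' UNION ALL
  SELECT 2, 'hello again'
)
SELECT
  q.id,
  -- ARRAY(...) accepts zero, one, or many rows per question.
  -- Swapping it for (SELECT AS STRUCT body b ...) should fail here,
  -- because question 1 has two answers.
  ARRAY(SELECT body FROM answers WHERE parent_id = q.id) AS answer_bodies
FROM questions AS q
WHERE 'google-bigquery' IN UNNEST(SPLIT(q.tags, '|'))
```

Only question 1 carries the google-bigquery tag, so the result should be a single row whose answer_bodies array holds both of its answers. Element order inside the array is not guaranteed unless the subquery has an ORDER BY.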


 