How to JOIN two tables in BigQuery when the join key is nested

Sorry for the newbie question, I just started learning SQL. I have two tables:

  1. sessions
  2. items

The sessions table has questions (RECORD, Repeated), and inside questions there is item_id (String).

The items table has topics (RECORD, Repeated), and inside topics there is prior_difficulty (String). The items table also has a top-level item_id (String).

My objective is to get a list of sessions with their prior_difficulty values by joining the two tables on item_id. TIA

You can first use unnest() to flatten questions and get every item_id from the sessions table, then join those values with item_id from the items table.
To retrieve prior_difficulty from the repeated record column topics, use unnest() again:

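-- session_id is assumed to be a column of sessions
select distinct
  sessions.session_id,
  t.prior_difficulty
from sessions, unnest(sessions.questions) q    -- one row per question in each session
left join items on q.item_id = items.item_id,  -- match each question to its item
  unnest(items.topics) t                       -- one row per topic of the matched item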
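
If you want to try this before pointing it at the real tables, here is a minimal sketch that runs the same query against inline sample data. The rows, the values 'easy' and 'hard', and the session_id column are made up for illustration; only questions, topics, item_id and prior_difficulty come from the question.

```sql
-- Hypothetical sample data standing in for the real tables.
with sessions as (
  select 's1' as session_id,
         [struct('i1' as item_id), struct('i2' as item_id)] as questions
  union all
  select 's2', [struct('i3' as item_id)]          -- i3 has no row in items
),
items as (
  select 'i1' as item_id,
         [struct('easy' as prior_difficulty)] as topics
  union all
  select 'i2',
         [struct('hard' as prior_difficulty), struct('easy' as prior_difficulty)]
)
select distinct
  sessions.session_id,
  t.prior_difficulty
from sessions, unnest(questions) q
left join items on q.item_id = items.item_id, unnest(topics) t
order by session_id, prior_difficulty
```

With this sample data the result should be two rows, (s1, easy) and (s1, hard). The duplicate 'easy' is collapsed by distinct, and s2 is absent because its only item, i3, has no row in items.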

Or, if you want a repeated column that groups the prior_difficulty values by session_id:

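select
  sessions.session_id,
  -- collect each distinct difficulty once per session, skipping nulls
  array_agg(distinct t.prior_difficulty ignore nulls) as prior_difficulties
from sessions, unnest(sessions.questions) q
left join items on q.item_id = items.item_id, unnest(items.topics) t
group by 1  -- 1 = first select column, i.e. session_id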
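
One thing to watch: the comma before unnest(topics) is a cross join, so a question whose item_id has no match in items, or an item whose topics is empty, produces no row at all, even though items itself is left-joined. If those sessions should still show up, a possible variant is to left-join the unnest as well. This is only a sketch: the sample rows, the values, and the session_id column are assumptions made for illustration.

```sql
-- Hypothetical sample data standing in for the real tables.
with sessions as (
  select 's1' as session_id,
         [struct('i1' as item_id), struct('i2' as item_id)] as questions
  union all
  select 's2', [struct('i3' as item_id)]          -- i3 has no row in items
),
items as (
  select 'i1' as item_id,
         [struct('easy' as prior_difficulty)] as topics
  union all
  select 'i2',
         [struct('hard' as prior_difficulty), struct('easy' as prior_difficulty)]
)
select
  sessions.session_id,
  array_agg(distinct t.prior_difficulty ignore nulls) as prior_difficulties
from sessions, unnest(sessions.questions) as q
left join items on q.item_id = items.item_id
left join unnest(items.topics) as t               -- keeps rows where items or topics is missing
group by 1
order by session_id
```

Here s1 should come back with 'easy' and 'hard' collected into one array, and s2 still gets a row, just with no difficulty values. Sessions whose questions array is itself empty are still dropped by the first comma join; swap that for a left join too if you need them.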
