BigQuery Standard SQL count original rows after CROSS JOIN UNNEST

I have a table with a repeated field that requires a CROSS JOIN UNNEST, and I want to get the count of the original, nested rows. For example:

SELECT studentId, COUNT(1) AS studentCount
FROM myTable
CROSS JOIN UNNEST(classes) AS classes
WHERE classes.id IN ('1', '2')

Right now, if a student is in both class 1 and class 2, that student is counted twice in studentCount.
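
A small sketch with made-up rows reproduces this (the data in the WITH clause is an assumption for illustration, not the real table). Student 1 is in classes 1 and 2, so the join yields two rows for that one student:

```sql
#standardSQL
WITH myTable AS (
  -- Hypothetical sample data: student 1 takes classes 1, 2, 3; student 2 takes classes 4, 5
  SELECT 1 AS studentId, [STRUCT<id STRING>('1'), STRUCT('2'), STRUCT('3')] AS classes UNION ALL
  SELECT 2, [STRUCT<id STRING>('4'), STRUCT('5')]
)
SELECT COUNT(1) AS studentCount  -- counts unnested rows, not students
FROM myTable
CROSS JOIN UNNEST(classes) AS classes
WHERE classes.id IN ('1', '2')
```

Here studentCount comes back as 2, although only one student matches the filter.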

I know I can use count(distinct(student.id)) to work around this, but that ends up being a lot slower than a simple count. It doesn't take advantage of the fact that there's exactly one row per student.

So is there any way to compute the count of the original rows before unnesting (but after the WHERE clause) while still including the unnest in the query?

Note that this must be in Standard SQL.

I understood your "challenge" as showing only the students from classes 1 and 2 while still showing the total count of students in all classes. If that is the case, see below:

#standardSQL
SELECT studentId, studentCount
FROM myTable
CROSS JOIN (SELECT COUNT(1) studentCount FROM myTable)
WHERE studentId IN (
  SELECT studentID FROM UNNEST(classes) AS classes
  WHERE classes.id IN ('1', '2')
)

You can test / play with it using dummy data, as below:

#standardSQL
WITH myTable AS (
  SELECT 1 AS studentId, [STRUCT<id STRING>('1'),STRUCT('2'),STRUCT('3')] AS classes UNION ALL
  SELECT 2, [STRUCT<id STRING>('4'),STRUCT('5')]
)
SELECT studentId, studentCount
FROM myTable
CROSS JOIN (SELECT COUNT(1) studentCount FROM myTable)
WHERE studentId IN (
  SELECT studentID FROM UNNEST(classes) AS classes
  WHERE classes.id IN ('1', '2')
)  

If your desired output is different from what I guessed, you may still find the above useful for calculating studentCount.
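
If the count you are after is instead the number of original rows that pass the filter (rather than all rows in the table), one possible sketch filters with EXISTS over the array, so that no row is multiplied before counting. The alias c and the dummy rows are assumptions for illustration:

```sql
#standardSQL
WITH myTable AS (
  -- Hypothetical sample data, same shape as the dummy table above
  SELECT 1 AS studentId, [STRUCT<id STRING>('1'), STRUCT('2'), STRUCT('3')] AS classes UNION ALL
  SELECT 2, [STRUCT<id STRING>('4'), STRUCT('5')]
)
SELECT COUNT(*) AS studentCount
FROM myTable
WHERE EXISTS (
  -- Correlated subquery: looks inside this row's array without joining it
  SELECT 1 FROM UNNEST(classes) AS c
  WHERE c.id IN ('1', '2')
)
```

With this data the result is 1: only student 1 has a matching class, and that student is counted once even though two classes match. Note that this variant does not keep the unnested rows in the output, so it only fits if the unnest is needed for filtering.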

Given just the original constraints (that unnesting is required and that you need to count the number of students), you can use a query of this form:

SELECT studentId, (SELECT COUNT(*) FROM myTable) AS studentCount
FROM myTable
CROSS JOIN UNNEST(classes) AS classes
WHERE classes.id IN ('1', '2')
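
As a rough check of this form against made-up data (the two rows in the WITH clause are assumptions, not part of the original table), the scalar subquery is evaluated against myTable itself, so the unnest does not inflate it:

```sql
#standardSQL
WITH myTable AS (
  -- Hypothetical sample data: student 1 takes classes 1, 2, 3; student 2 takes classes 4, 5
  SELECT 1 AS studentId, [STRUCT<id STRING>('1'), STRUCT('2'), STRUCT('3')] AS classes UNION ALL
  SELECT 2, [STRUCT<id STRING>('4'), STRUCT('5')]
)
SELECT
  studentId,
  (SELECT COUNT(*) FROM myTable) AS studentCount  -- counts original rows, independent of the join
FROM myTable
CROSS JOIN UNNEST(classes) AS classes
WHERE classes.id IN ('1', '2')
```

With these rows, student 1 appears twice in the output (once per matching class), and studentCount is 2 on both rows because it counts every row in myTable, including student 2, who is filtered out by the WHERE clause.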
