Counting values in a repeated field in BigQuery

Question

I want to select rows that have more than k values in a repeated field (for example, selecting users that have more than 3 email addresses).

In Standard SQL I know I can use

SELECT * FROM dataset.users
WHERE array_length(email_address) > 3

But what is the way to do this in BigQuery legacy SQL?

Answer 1

No need for a subquery; you should be able to filter with OMIT RECORD IF directly:

SELECT *
FROM dataset.users
OMIT RECORD IF COUNT(email_address) <= 3;

Do you mind commenting on why you want to use legacy SQL, though? If you encountered a problem with standard SQL I'd like to understand what it was so that we can fix it. Thanks!

Answer 2

Counting values in a repeated field in BigQuery

BigQuery Legacy SQL

SELECT COUNT(email_address) WITHIN RECORD AS address_count
FROM [dataset.users]

If you then want to count the output rows, you can use the query below:

SELECT COUNT(1) AS rows_count
FROM (
  SELECT COUNT(email_address) WITHIN RECORD AS address_count
  FROM [dataset.users]
)
WHERE address_count > 3
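For comparison, here is a minimal Standard SQL sketch of the same row count that can be run without an existing table. The inline `users` data (the names and example.com addresses) is made up purely for illustration; only the `email_address` column name comes from the question.

```sql
#standardSQL
-- Hypothetical sample data standing in for dataset.users.
WITH users AS (
  SELECT 'alice' AS name,
         ['a1@example.com', 'a2@example.com', 'a3@example.com', 'a4@example.com'] AS email_address
  UNION ALL
  SELECT 'bob', ['b1@example.com']
  UNION ALL
  SELECT 'carol',
         ['c1@example.com', 'c2@example.com', 'c3@example.com', 'c4@example.com', 'c5@example.com']
)
-- Count the rows whose repeated field holds more than 3 values.
SELECT COUNT(*) AS rows_count
FROM users
WHERE ARRAY_LENGTH(email_address) > 3;
```

With this sample data, alice (4 addresses) and carol (5 addresses) pass the filter, so rows_count is 2. Replacing `COUNT(*) AS rows_count` with `name, ARRAY_LENGTH(email_address) AS address_count` would list the matching rows instead of counting them.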

2 answers

Answer 1
7, accepted, 2016-09-19 16:19:47

Answer 2
0, 2016-09-19 13:28:17
