
Counting Values in a repeated field in BigQuery

I want to select rows that have more than k values in a repeated field (for example, selecting users that have more than 3 email addresses).

In Standard SQL I know I can use:

SELECT * FROM dataset.users
WHERE array_length(email_address) > 3

But what is the way to do this in BigQuery legacy SQL?

No need for a subquery; you should be able to filter with OMIT RECORD IF directly:

SELECT *
FROM dataset.users
OMIT RECORD IF COUNT(email_address) <= 3;
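
To see why the condition is inverted relative to the Standard SQL query, here is a small Python model of the two filters. It is only an illustration, not BigQuery code: the `users` sample data and both helper names are made up for this sketch. `WHERE` keeps the rows for which the predicate holds, while `OMIT RECORD IF` drops them, so `> 3` becomes `<= 3`.

```python
# Hypothetical sample data: each user has a repeated email_address field.
users = [
    {"name": "alice", "email_address": ["a1@example.com", "a2@example.com",
                                        "a3@example.com", "a4@example.com"]},
    {"name": "bob", "email_address": ["b1@example.com"]},
    {"name": "carol", "email_address": ["c1@example.com", "c2@example.com",
                                        "c3@example.com"]},
]

def keep_where(rows, k=3):
    # Models: WHERE ARRAY_LENGTH(email_address) > k
    return [r for r in rows if len(r["email_address"]) > k]

def omit_record_if(rows, k=3):
    # Models: OMIT RECORD IF COUNT(email_address) <= k
    return [r for r in rows if not (len(r["email_address"]) <= k)]

kept = omit_record_if(users)
print([r["name"] for r in kept])
```

Both helpers select the same users; only the direction of the comparison differs.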

Do you mind commenting on why you want to use legacy SQL, though? If you encountered a problem with standard SQL I'd like to understand what it was so that we can fix it. Thanks!


BigQuery Legacy SQL

SELECT COUNT(email_address) WITHIN RECORD AS address_count
FROM [dataset.users]

If you then want to count the output rows, you can use the query below:

SELECT COUNT(1) AS rows_count 
FROM (
  SELECT COUNT(email_address) WITHIN RECORD AS address_count
  FROM [dataset.users]
)
WHERE address_count > 3
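
The two stages of that query can be pictured with a short Python sketch. This is a rough model under stated assumptions, not BigQuery code: `records` is hypothetical data in which each inner list stands for one user's repeated `email_address` field.

```python
# Hypothetical repeated-field data: one list of addresses per user record.
records = [
    ["a1@example.com", "a2@example.com", "a3@example.com", "a4@example.com"],
    [],
    ["c1@example.com", "c2@example.com", "c3@example.com",
     "c4@example.com", "c5@example.com"],
    ["d1@example.com"],
]

# Inner query: COUNT(email_address) WITHIN RECORD yields one count per record.
address_counts = [len(addresses) for addresses in records]

# Outer query: COUNT(1) over the rows whose address_count exceeds 3.
rows_count = sum(1 for n in address_counts if n > 3)

print(address_counts, rows_count)
```

The inner step produces one number per input row, and the outer step reduces those numbers to a single total.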
