How do I compute boolean aggregates of a column in BigQuery?

Question

I have a table of user events. I want to project those events into a new column using some predicate, and then aggregate the events per user into a new projection that tells me whether the predicate has ever matched for that user, has never matched, and so on.

In other languages this is usually called all() and any(): you pass a list of boolean values and get back whether all of them are true, or whether at least one is. It's equivalent to applying a boolean AND across all the values (for all) or a boolean OR across all the values (for any).

Does BigQuery have this feature? I can sort of approximate it using max and min, but it's not ideal.

Example:

select
month(date_time) m,
count(*) as ct,
max(id_is_present),
min(id_is_present),
max(starts_with_one) max_one,
min(starts_with_one) min_one,
from
(
    select
    length(user_id) > 1 id_is_present,
    regexp_match(user_id, r'^1') starts_with_one,
    date_time
    from
    [user_events.2015_02]
)
group by
m

This exploits the fact that max over the values true, false, false yields true (and min yields false), so max behaves like any and min behaves like all on a boolean column.

Is this the hack I have to rely on, or does BigQuery support boolean aggregates?

Answer 1

Yes, BigQuery has such aggregate functions, and it uses the SQL standard names for them:

EVERY (will do logical and)
SOME (will do logical or)
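
As a rough sketch of how that would look, here is the question's query with SOME and EVERY substituted for max and min. It assumes the legacy SQL dialect the question is written in (bracketed table name, month(), regexp_match()); the output column aliases are made up for illustration.

```sql
-- Legacy SQL sketch: SOME acts like any(), EVERY acts like all().
-- The aliases (any_id_present, ...) are illustrative, not from the original post.
SELECT
  MONTH(date_time) AS m,
  COUNT(*) AS ct,
  SOME(id_is_present) AS any_id_present,
  EVERY(id_is_present) AS all_ids_present,
  SOME(starts_with_one) AS any_starts_with_one,
  EVERY(starts_with_one) AS all_start_with_one
FROM (
  SELECT
    LENGTH(user_id) > 1 AS id_is_present,
    REGEXP_MATCH(user_id, r'^1') AS starts_with_one,
    date_time
  FROM [user_events.2015_02]
)
GROUP BY m
```

Grouping by user_id instead of m would give the per-user "has this ever matched" view described in the question.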

Answer 2

In case someone else stumbles across this: BigQuery's standard SQL dialect offers logical_and() and logical_or(). With the legacy-only constructs from the question (month(), regexp_match(), the bracketed table name) replaced by their standard SQL counterparts, the code could be written as:

-- logical_or acts like any(), logical_and acts like all()
select extract(month from date_time) as m, count(*) as ct,
       logical_or(id_is_present),
       logical_and(id_is_present),
       logical_or(starts_with_one) as max_one,
       logical_and(starts_with_one) as min_one
from (select length(user_id) > 1 as id_is_present,
             regexp_contains(user_id, r'^1') as starts_with_one,
             date_time
      from `user_events.2015_02`
      ) u
group by m;

2 answers

Answer 1: 4 votes, accepted, 2015-02-10 18:34:25

Answer 2: 3 votes, 2017-06-09 19:43:53
