
How Do I Randomly Match Elements in BigQuery

I want to randomly match elements of a table by some condition (1:1 and symmetric pairing, so if user x matches with user y, then user y matches with user x).

WITH tbl AS (
  SELECT 1 AS user, 'A' AS condition
  UNION ALL
  SELECT 2 AS user, 'A' AS condition
  UNION ALL
  SELECT 3 AS user, 'A' AS condition
  UNION ALL
  SELECT 4 AS user, 'B' AS condition
  UNION ALL
  SELECT 5 AS user, 'B' AS condition
  UNION ALL
  SELECT 6 AS user, 'B' AS condition
  UNION ALL
  SELECT 7 AS user, 'B' AS condition
  UNION ALL
  SELECT 8 AS user, 'B' AS condition
  UNION ALL
  SELECT 9 AS user, 'B' AS condition
)
SELECT
  user,
  condition
FROM tbl

How can I write a query to:

  1. randomly match every user in condition B with another user in condition B
  2. do the same with condition A, but leave one condition A user unmatched, because there is an odd number of users with condition A
  3. identify the unmatched user (1, 2, or 3) with a NULL match

Hypothetical result:

user  condition  match
1     A          3
2     A          NULL
3     A          1
4     B          7
5     B          9
6     B          8
7     B          4
8     B          6
9     B          5

Consider the approach below:

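-- Step 1: shuffle users within each condition and pair neighbours
create temp table temp as 
select _0 as user, condition, _1 as match from (
  -- offsets 0,1 share grp 0; offsets 2,3 share grp 1; pos is the slot within the pair
  select user, condition, div(offset, 2) grp, mod(offset, 2) pos
  from (
    select condition, array_agg(user order by rand()) users
    from your_table
    group by condition
  ), unnest(users) user with offset
)
pivot (any_value(user) for pos in (0,1));

-- Step 2: emit each pair in both directions; the swapped copy of an
-- unmatched row has a NULL user and is filtered out
select * from (
  select * from temp union all
  select match, condition, user from temp 
)
where not user is null;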

If applied to the sample data in your question, the output has the same shape as the hypothetical result above (the exact pairs vary from run to run because of rand()).
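To make the pairing logic easier to follow outside BigQuery, here is a minimal Python sketch of the same steps: shuffle the users within each condition, then pair offsets 0 and 1, 2 and 3, and so on. It is only a model of the query, not BigQuery code; the function name `random_pairs` and the `seed` parameter are hypothetical and exist purely for illustration.

```python
import random
from collections import defaultdict


def random_pairs(rows, seed=None):
    """Model of the query: rows is a list of (user, condition) tuples.

    Returns a dict mapping each user to (condition, match), where match is
    None for the one leftover user in a condition with an odd user count.
    """
    rng = random.Random(seed)  # seed is a hypothetical knob for repeatable runs

    # Equivalent of: group by condition
    by_condition = defaultdict(list)
    for user, condition in rows:
        by_condition[condition].append(user)

    result = {}
    for condition, users in by_condition.items():
        shuffled = users[:]
        rng.shuffle(shuffled)  # equivalent of array_agg(user order by rand())

        # Offsets 2k and 2k+1 share the same group k, so they form one pair.
        for offset in range(0, len(shuffled), 2):
            first = shuffled[offset]
            second = shuffled[offset + 1] if offset + 1 < len(shuffled) else None
            result[first] = (condition, second)
            if second is not None:  # mirrors "where not user is null"
                result[second] = (condition, first)
    return result


rows = [(1, "A"), (2, "A"), (3, "A")] + [(u, "B") for u in range(4, 10)]
for user, (condition, match) in sorted(random_pairs(rows).items()):
    print(user, condition, match)
```

Every user appears exactly once, each match is mutual, and at most one user per condition ends up with no match.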

As you can see, the solution above requires scripting (a temp table followed by a second statement), so below is a slightly refactored version that achieves the same result with just one "simple" query:

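select * from (
  -- negative grp keeps the pair as (_0, _1); positive grp emits it swapped
  select if(grp < 0, _0, _1) as user, condition, if(grp < 0, _1, _0) as match from (
    select user, condition, grp, mod(offset, 2) pos
    from (
      select condition, array_agg(user order by rand()) users
      from your_table
      group by condition
    -- each user is emitted twice: once with its pair number, once with the negated pair number
    ), unnest(users) user with offset, unnest([div(offset + 2, 2), -1 * div(offset + 2, 2)]) grp
  )
  pivot (any_value(user) for pos in (0,1))
)
-- drops the swapped copy of an unmatched user, whose user column is NULL
where not user is null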
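The signed-group trick in the single query can be hard to read, so here is a small Python model of it for one condition. It is a sketch, not BigQuery code, and the function name `rows_from_signed_groups` is hypothetical. Each user is emitted under a positive and a negative pair number; presumably the `offset + 2` is there so the pair number starts at 1 and never equals 0, since 0 and -0 would collapse into a single group.

```python
def rows_from_signed_groups(shuffled):
    """shuffled: users of one condition, already in random order.

    Returns the (user, match) rows the single query would produce.
    """
    # Collect the pivot cells: key is the signed group, value is [_0, _1].
    cells = {}
    for offset, user in enumerate(shuffled):
        grp = (offset + 2) // 2       # div(offset + 2, 2): 1, 1, 2, 2, ...
        pos = offset % 2              # mod(offset, 2)
        for signed in (grp, -grp):    # unnest([grp, -1 * grp])
            cells.setdefault(signed, [None, None])[pos] = user

    rows = []
    for signed, (_0, _1) in cells.items():
        # if(grp < 0, _0, _1) as user, if(grp < 0, _1, _0) as match
        user, match = (_0, _1) if signed < 0 else (_1, _0)
        if user is not None:          # where not user is null
            rows.append((user, match))
    return rows


print(sorted(rows_from_signed_groups([5, 9, 4, 7, 6]), key=lambda r: r[0]))
# prints [(4, 7), (5, 9), (6, None), (7, 4), (9, 5)]
```

With five users, the last one (6) has an empty `_1` slot, so only its negative-group row survives and it is reported once with no match.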


 