
Select IDs randomly according to percentage categories in BigQuery

I have a table:

id  type_name   product_id  
1   a           abn 
2   b           adj 
3   c           wjek
4   a           jdeks   
5   a           uweye
6   c           qjqk
7   b           wdsk
8   a           jserks  
9   b           uwee
10  c           qek
......

Another source gives the target share of each type_name: a: 10%, b: 60%, c: 30%.

I want to select ids randomly from the table, but the selected ids must reflect the type_name percentages. For example, if I want to take 20 ids:

  • type_name a: 10% x 20 = 2
  • type_name b: 60% x 20 = 12
  • type_name c: 30% x 20 = 6

Consider the approach below:

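with splits as (
  -- target share of each type_name, in percent (values from the question)
  select 'a' type_name, 10 percent union all
  select 'b', 60 union all
  select 'c', 30 
)
select t.* from your_table t
join splits using(type_name)
-- 20 is the total sample size: shuffle each type and keep its first round(20 * percent / 100) rows
qualify round(20 / 100 * percent) >= row_number() over(partition by type_name order by rand())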
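
To check that the sample matches the requested split, the same idea can be run against inline data and counted per type. This is only a sketch: the 60 generated rows, the `ranked` CTE, and the `rn` and `picked` column names are made up for illustration, and the row number is filtered in a plain subquery instead of `qualify`, which should be equivalent here.

```sql
with your_table as (
  -- Hypothetical stand-in for the real table: 60 rows, 20 per type_name.
  select
    id,
    case mod(id, 3) when 1 then 'a' when 2 then 'b' else 'c' end as type_name,
    concat('p', cast(id as string)) as product_id
  from unnest(generate_array(1, 60)) as id
),
splits as (
  -- Target share of each type_name, in percent (from the question).
  select 'a' as type_name, 10 as percent union all
  select 'b', 60 union all
  select 'c', 30
),
ranked as (
  -- Number the rows of each type in random order.
  select
    t.*,
    s.percent,
    row_number() over (partition by type_name order by rand()) as rn
  from your_table t
  join splits s using (type_name)
)
-- Keep the first round(20 * percent / 100) rows of each type and count them.
select type_name, count(*) as picked
from ranked
where rn <= round(20 / 100 * percent)
group by type_name
order by type_name
-- With 20 rows available per type, this returns a: 2, b: 12, c: 6.
```

Two caveats apply. If a type has fewer rows than its quota, fewer rows come back for that type. Because each quota is rounded independently, the total can also differ from the requested size: with shares of 33/33/34 and a sample size of 10, the quotas round to 3 + 3 + 3 = 9.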

