Programmatic way to loop over columns and replace null values with zeros in BigQuery?

I am trying to prepare a large table in BigQuery for a regression that involves lots of "dummy" (aka categorical) variables.

One of the final steps in this process requires me to replace all null values in the table with zeros.

Is there a clean and programmatic way to do this in BigQuery? For example, in the table below, I'd ideally like to loop over all the "country_*" fields and replace nulls with zeros without hard-coding the column names. I have an inkling that this may be a job for dynamic SQL, but I get pretty lost swimming in the documentation. Any help would be greatly appreciated!

TLDR: This is an example of the data structure I'm facing.

country  country_1  country_2  country_3  other covariates
1        1          -          -
2        -          1          -
3        -          -          1

This is what I'd like to have:

country  country_1  country_2  country_3  other covariates
1        1          0          0
2        0          1          0
3        0          0          1

Simpleton method:

select country, 
       ifnull(country_1, 0) as country_1,
       ...
FROM TABLE

Try the query below:

-- JS UDFs that return the keys and the values of a JSON object, in matching order
create temp function extract_keys(input string) returns array<string> language js as "return Object.keys(JSON.parse(input));";
create temp function extract_values(input string) returns array<string> language js as "return Object.values(JSON.parse(input));";
select * except(json)
from (
  select json, col, val
  from your_table t,
  -- serialize each row to JSON and turn every null into 0
  unnest([struct(replace(to_json_string(t), ':null', ':0') as json)]),
  -- pair each column name with its value by array position
  unnest(extract_keys(json)) col with offset
  join unnest(extract_values(json)) val with offset
  using(offset)
)
-- rebuild one output column per listed key
pivot (any_value(val) for col in ('country', 'country_1', 'country_2', 'country_3'))

If applied to the sample data in your question, the output is:

[image of the query output, not preserved]
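
Since the question mentions dynamic SQL, here is a minimal sketch of that route, not a tested drop-in. It builds the select list from the table's metadata and wraps every column whose name starts with country_ in ifnull(..., 0). The names your_dataset and your_table are placeholders, and the sketch assumes the country_* columns are numeric.

```sql
-- Assumption: your_dataset and your_table are placeholder names.
-- Assumption: every country_* column is numeric, so ifnull(col, 0) type-checks.
declare select_list string;

-- Build the select list from metadata, keeping the original column order.
set select_list = (
  select string_agg(
    if(starts_with(column_name, 'country_'),
       format('ifnull(`%s`, 0) as `%s`', column_name, column_name),
       format('`%s`', column_name)),
    ', ' order by ordinal_position)
  from your_dataset.INFORMATION_SCHEMA.COLUMNS
  where table_name = 'your_table'
);

-- Run the generated query.
execute immediate format('select %s from your_dataset.your_table', select_list);
```

Unlike the JSON round trip, this keeps the original column types and does not require listing the columns by hand. The column named country is left alone because it does not match the country_ prefix.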
